\newpage
\subsubsection{Rounding}

Conceptually, every floating point operation proceeds in two steps:
\begin{enumerate}
    \item Compute the exact result
    \item Round it so that it fits the desired precision
\end{enumerate}
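
A minimal sketch of these two steps, assuming a platform where Python's \texttt{float} is an IEEE 754 double and the default rounding mode is active. The helper name \texttt{add\_exact\_then\_round} is hypothetical; \texttt{fractions.Fraction} is used because it can represent every double exactly.

```python
from fractions import Fraction

def add_exact_then_round(x: float, y: float) -> float:
    """Add two doubles by first computing the exact sum, then rounding."""
    # Step 1: exact result (a Fraction holds any double without error).
    exact = Fraction(x) + Fraction(y)
    # Step 2: round the exact result to the nearest representable double.
    return float(exact)

exact_sum = Fraction(0.1) + Fraction(0.2)
# The exact sum needs more bits than a double offers, so it must be rounded.
print(exact_sum == Fraction(0.1 + 0.2))
# Rounding the exact sum gives the same double as the built-in addition.
print(add_exact_then_round(0.1, 0.2) == 0.1 + 0.2)
```

The first comparison fails because the exact sum is not representable, while the second holds because the built-in addition behaves as if it had rounded the exact result.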

\textit{IEEE Standard 754} specifies $4$ rounding modes: \textit{Towards Zero, Round Down} (towards $-\infty$), \textit{Round Up} (towards $+\infty$) and \textit{Nearest Even}.

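The default is \textit{Nearest Even}\footnote{In C, the rounding mode can be changed with \texttt{fesetround} from \texttt{fenv.h} (C99); where no such library support exists, this usually requires Assembly.}, which rounds to the closer of the two neighbouring representable numbers, like regular rounding, but picks the one whose least significant bit is $0$ (the even one) if the exact result lies exactly in the middle.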
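
The four modes can be compared on a few decimal values. The sketch below rounds to an integer with Python's \texttt{decimal} module. Mapping the IEEE modes onto the constants \texttt{ROUND\_DOWN}, \texttt{ROUND\_FLOOR}, \texttt{ROUND\_CEILING} and \texttt{ROUND\_HALF\_EVEN} is an assumption made for illustration, since \texttt{decimal} works in base $10$ rather than on binary floating point. The names \texttt{MODES} and \texttt{round\_to\_int} are hypothetical.

```python
from decimal import (Decimal, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_CEILING, ROUND_HALF_EVEN)

# IEEE 754 mode name -> decimal rounding constant with the same behaviour
MODES = {
    "Towards Zero": ROUND_DOWN,       # drop the fractional part
    "Round Down":   ROUND_FLOOR,      # towards -infinity
    "Round Up":     ROUND_CEILING,    # towards +infinity
    "Nearest Even": ROUND_HALF_EVEN,  # ties go to the even neighbour
}

def round_to_int(text: str, mode: str) -> int:
    """Round the decimal literal `text` to an integer using `mode`."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=MODES[mode]))

VALUES = ("1.40", "1.60", "1.50", "2.50", "-1.50")
for mode in MODES:
    row = [round_to_int(v, mode) for v in VALUES]
    print(f"{mode:13} {row}")
```

Under \textit{Nearest Even}, both $1.50$ and $2.50$ round to $2$, which is the only place where this mode differs from regular rounding.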

Rounding can be decided using $3$ bits derived from the \textit{exact} number: $G, R, S$
$$
    a = 1.B_1B_2\ldots B_{n - 2}B_{n - 1}\underbrace{G}_\text{Guard}\underbrace{R}_\text{Round}
    \underbrace{X_1X_2\ldots X_{k - 1}X_k}_\text{Sticky}
$$
where $n$ is the number of bits in the mantissa of the format (e.g. $3$ as in the above example of an $8$-bit floating point number) and $k$ is the number of remaining bits of the exact result.

\begin{enumerate}
    \item \textbf{Guard Bit} $G$ is the least significant bit of the (rounded) result (i.e. it is $B_n$)
    \item \textbf{Round Bit} $R$ is the first bit that is cut off by rounding
    \item \textbf{Sticky Bit} $S$ is the logical OR of all remaining cut off bits $X_i$, i.e. $S = X_1 \lor X_2 \lor \ldots \lor X_k$
\end{enumerate}
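
As a worked example, assuming $n = 3$ and the exact $8$-bit value $138 = 10001010_2$, the three bits are read off the normalized number as follows:
$$
    138 = 1.00\underbrace{0}_{G}\underbrace{1}_{R}\underbrace{010}_{X_1X_2X_3} \cdot 2^7
    \quad\Longrightarrow\quad
    G = 0,\quad R = 1,\quad S = 0 \lor 1 \lor 0 = 1
$$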

Based on these bits the rounding can be decided. The result is first truncated after $G$ and then incremented by one unit in the last place if either of the following expressions evaluates to true:
\hrmvspace
\begin{align*}
    \text{Round up: } R \land S
     &  &
    \text{Round to even: } G \land R \land \lnot S
\end{align*}
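
Since the truncated result is incremented when either condition holds, the two cases can be combined into a single expression. This is a short derivation using the distributive law, not part of the standard itself:
\begin{align*}
    \text{Increment} & = (R \land S) \lor (G \land R \land \lnot S) \\
                     & = R \land \big(S \lor (G \land \lnot S)\big)  \\
                     & = R \land (S \lor G)
\end{align*}
In particular, whenever $R = 0$ the exact result lies below the midpoint and is simply truncated.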

\drmvspace
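\content{Example} Rounding exact $8$-bit values to the $8$-bit floating point format ($3$b mantissa, i.e. $4$ significant bits including the leading $1$; the bar $|$ marks where the bits are cut off):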

\renewcommand{\arraystretch}{1.2}
\begin{center}
    \begin{tabular}{|c|c|c|c|c|}
        \hline
        \textbf{Value} & \textbf{Fraction} & \textbf{GRS} & \textbf{Incr?} & \textbf{Rounded} \\
        \hline
        $128$          & $1.000|0000$      & $000$        & N              & $1.000$          \\
        $13$           & $1.101|0000$      & $100$        & N              & $1.101$          \\
        $17$           & $1.000|1000$      & $010$        & N              & $1.000$          \\
        $19$           & $1.001|1000$      & $110$        & Y              & $1.010$          \\
        $138$          & $1.000|1010$      & $011$        & Y              & $1.001$          \\
        $63$           & $1.111|1100$      & $111$        & Y              & $10.000$         \\
        \hline
    \end{tabular}
\end{center}
\renewcommand{\arraystretch}{1.0}


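\textbf{Post-Normalization}: Rounding may cause the mantissa to overflow (e.g. $1.111$ is incremented to $10.000$ for the value $63$ above). In this case: Shift the mantissa right once and increment the exponent, which gives $1.000 \cdot 2^6 = 64$.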
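
The rounding table and the post-normalization step can be sketched together in a few lines of Python. This is a minimal illustration for positive integers, not a full floating point implementation; the function name \texttt{round\_to\_frac\_bits} and its return layout are hypothetical. Unlike the table, the sketch returns the mantissa \textit{after} post-normalization, so $63$ yields $1.000$ with exponent $6$.

```python
def round_to_frac_bits(value: int, frac_bits: int = 3):
    """Round a positive integer to 1.<frac_bits> * 2**exp (Nearest Even).

    Returns (sig, exp, grs, incremented); `sig` holds the bits of the
    mantissa 1.BBB as an integer, i.e. without the binary point.
    """
    assert value > 0
    exp = value.bit_length() - 1          # position of the leading 1
    cut = exp - frac_bits                 # how many low bits are cut off
    if cut <= 0:                          # everything fits: nothing cut off
        sig, r, s = value << -cut, 0, 0
    else:
        sig = value >> cut                            # truncated mantissa
        r = (value >> (cut - 1)) & 1                  # first cut off bit
        s = int(value & ((1 << (cut - 1)) - 1) != 0)  # OR of the rest
    g = sig & 1                           # last bit that is kept
    incremented = bool((r and s) or (g and r and not s))
    if incremented:
        sig += 1
        if sig >> (frac_bits + 1):        # mantissa overflowed to 10.000
            sig >>= 1                     # post-normalization: shift right
            exp += 1                      # ... and increment the exponent
    return sig, exp, f"{g}{r}{s}", incremented

for v in (128, 13, 17, 19, 138, 63):
    sig, exp, grs, incremented = round_to_frac_bits(v)
    print(v, grs, "Y" if incremented else "N", f"{sig:04b}", exp)
```

The rounded value can be recovered as \texttt{sig * 2**(exp - frac\_bits)}, e.g. $138$ becomes $1001_2 \cdot 2^4 = 144$.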