
\subsection{Caches}

Processor speed has generally improved faster than memory speed, so memory access is often the bottleneck. Caches reduce this gap by keeping frequently used data in small, fast storage close to the processor.

\content{Structure} A cache is characterized by three parameters $S, E, B$ such that $S \cdot E \cdot B = \text{Cache Size}$ (in bytes).\\
$S = 2^s$ is the number of sets, $E = 2^e$ is the number of lines per set, and $B = 2^b$ is the number of bytes per cache block.
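
\content{Example} As a quick sanity check of the size formula, assume a hypothetical cache with $S = 64$ sets, $E = 8$ lines per set and $B = 64$ bytes per block (illustrative values, not taken from a specific processor):
\[
    S \cdot E \cdot B = 2^6 \cdot 2^3 \cdot 2^6 = 2^{15}\,\text{B} = 32\,\text{KiB}, \qquad s = 6,\; e = 3,\; b = 6.
\]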

\content{Address} Using these parameters, a memory address is split into three fields that determine where the data may reside in the cache:

\begin{center}
    Address:
    \begin{tabular}{|c|c|c|}
        \hline
        tag & set index & block offset \\
        \hline
    \end{tabular}
\end{center}

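Since there are $S=2^s$ sets and $B=2^b$ bytes per block, the set index needs $s$ bits and the block offset needs $b$ bits. The remaining upper bits form the tag (for an $m$-bit address, $t = m - s - b$ bits). The tag is stored alongside the cache block and must match the tag of the requested address for a cache hit.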
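
\content{Example} The split can be traced on a concrete address. The following sketch assumes a 32-bit address and a hypothetical cache with $s = 6$ and $b = 6$ (both values are assumptions for illustration), so the tag has $t = 32 - s - b = 20$ bits. For the address \texttt{0x12345678}, the low 12 bits are \texttt{0x678} $= 0110\,0111\,1000_2$:
\[
    \underbrace{\texttt{0x12345}}_{\text{tag (20 bits)}} \quad
    \underbrace{011001_2 = 25}_{\text{set index (6 bits)}} \quad
    \underbrace{111000_2 = 56}_{\text{block offset (6 bits)}}
\]
The access therefore goes to set 25 and hits only if a line in that set stores the tag \texttt{0x12345}; the requested byte is then at offset 56 within the block.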

\inlinedef \textbf{Direct-mapped}: $E=1$, i.e. exactly one cache line per set.

\inlinedef \textbf{2-way Set-Associative}: $E=2$, i.e. two cache lines per set.
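
\content{Example} The difference shows up with conflicting addresses. As a sketch, assume $S = 64$ and $B = 64$ (hypothetical values, so $s = b = 6$) and the two addresses \texttt{0x0000} and \texttt{0x1000}:
\[
    \left\lfloor \frac{\texttt{0x0000}}{B} \right\rfloor \bmod S = 0, \qquad
    \left\lfloor \frac{\texttt{0x1000}}{B} \right\rfloor \bmod S = 64 \bmod 64 = 0.
\]
Both map to set 0 but have different tags (0 and 1). With $E=1$ they evict each other, so alternating accesses miss every time. With $E=2$ both blocks fit in the set and, assuming no other blocks compete for this set, only the first access to each block misses.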

\content{Example} The importance of caches becomes clear from the access latencies across the memory hierarchy:

\begin{tabular}{lllll}
    \toprule
    \textbf{Cache type}          &
    \textbf{What is cached?}     &
    \textbf{Where is it cached?} &
    \textbf{Latency (cycles)}    &
    \textbf{Managed by}            \\
    \midrule
    Registers                    &
    4/8-byte words               &
    CPU core                     &
    0                            &
    Compiler                       \\

    TLB                          &
    Address translations         &
    On-chip TLB                  &
    0                            &
    Hardware                       \\

    L1 cache                     &
    64-byte blocks               &
    On-chip L1                   &
    1                            &
    Hardware                       \\

    L2 cache                     &
    64-byte blocks               &
    On-chip L2                   &
    10                           &
    Hardware                       \\

    Virtual memory               &
    4\,kB pages                  &
    Main memory (RAM)            &
    100                          &
    Hardware + OS                  \\

    Buffer cache                 &
    4\,kB sectors                &
    Main memory                  &
    100                          &
    OS                             \\

    Network buffer cache         &
    Parts of files               &
    Local disk, SSD              &
    1{,}000{,}000                &
    SMB/NFS client                 \\

    Browser cache                &
    Web pages                    &
    Local disk                   &
    10{,}000{,}000               &
    Web browser                    \\

    Web cache                    &
    Web pages                    &
    Remote server disks          &
    1{,}000{,}000{,}000          &
    Web proxy server               \\
    \bottomrule
\end{tabular}

\subsubsection{Cache Addressing Schemes}

A cache can be indexed and tagged with either the virtual or the physical address. The two choices are independent: the set index and the tag do \textit{not} need to come from the same kind of address.

\begin{center}
    \begin{tabular}{c|c|c}
        \hline
        \textbf{Indexing}  & \textbf{Tagging}  & \textbf{Code} \\
        \hline
        Virtually Indexed  & Virtually Tagged  & VV            \\
        Virtually Indexed  & Physically Tagged & VP            \\
        Physically Indexed & Virtually Tagged  & PV            \\
        Physically Indexed & Physically Tagged & PP            \\
        \hline
    \end{tabular}
\end{center}
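
\content{Example} One consequence for the VP scheme can be worked out, assuming 4\,kB pages and a hypothetical cache with $s = 6$ and $b = 6$ (illustrative values). The page offset has $p = \log_2 4096 = 12$ bits and is identical in the virtual and the physical address, so
\[
    s + b = 6 + 6 = 12 \leq p = 12,
\]
i.e. the set index and block offset lie entirely within the page offset. Under these assumptions the set can be selected from the virtual address before translation finishes, and the physical tag is compared afterwards.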