[SPCA] Start multicore section

Commit beca74b40e (parent dcc430aff8)
2026-01-21 11:27:04 +01:00
10 changed files with 63 additions and 2 deletions


@@ -60,5 +60,5 @@ For example,
\begin{itemize}
\item \textbf{Interrupts} are actions like network data arrival or hitting a key on the keyboard
\item \textbf{Hard Reset Interrupts} are executed by hitting the system reset button
-\item \textbf{Soft Reset Interrupts} ate caused by, for example, hitting \verb|CTRL|+\verb|ALR|+\verb|DEL|
+\item \textbf{Soft Reset Interrupts} are caused by, for example, hitting \verb|CTRL|+\verb|ALT|+\verb|DEL| (on Windows)
\end{itemize}


@@ -0,0 +1,25 @@
\newpage
\subsection{Multi-Core}
\subsubsection{Background}
In the early days of computer hardware it was fairly easy to gain performance thanks to the rapid advances in transistor technology.
This trend is captured by what is known as Moore's Law, i.e. the observation that the transistor count of integrated circuits doubles roughly every two years.
However, due to power constraints and slowing advances in transistor technology,
transistor count growth has slowed down considerably since the beginning of the century and is predicted to stagnate further as time goes on.
This leads to various issues; among others, CPU performance no longer increases as quickly as it used to.
In particular, power constraints make it infeasible to build ever-faster single-core CPUs, and advances in that field have slowed to a crawl.
To mitigate and offset these issues, manufacturers started to add multiple cores to parallelize operations.
This, however, brings a whole host of new issues with it: for example, how do you ensure that no data races occur,
how do you schedule work across cores, etc.? These questions have mostly been answered in the course Parallel Programming, so we will not cover them here.
The only reason transistor count is still growing at a seemingly constant rate today is that manufacturers manage to cram more and more cores into a CPU.
But even that has slowed down in recent years.
While in 2019 the highest-core-count AMD EPYC CPU (the EPYC 7742 from the Rome family) had 64 Zen 2 cores,
in 2025 the highest-core-count EPYC CPU (the EPYC 9965 from the EPYC Turin Dense family) had 192 Zen 5c cores,
and the highest-core-count full-core CPU was the EPYC 9755 (from the EPYC Turin family) with 128 Zen 5 cores.
The way they manage this while not hitting the power wall is by making the CPUs physically larger.
While a consumer Ryzen 9 9950X3D (the fastest consumer CPU at the time of writing) easily fits into the palm of even a small hand,
an EPYC Turin CPU is so large that it covers most of even a big hand.


@@ -0,0 +1,21 @@
\subsubsection{Limitations}
\content{The Power Wall} More and more transistors require more and more power, so power delivery and heat dissipation become an issue.
To compute the power dissipation, use the formula $P_{diss} = P_{dyn} + P_{leak} + P_{short}$,
where $P_{dyn} = C V^2 f$ (with $C$ the capacitance, $V$ the supply voltage and $f$ the processor frequency) is the dynamic power,
$P_{leak}$ the leakage power (see DDCA) and $P_{short}$ the short circuit power while switching.
At some point the chip becomes almost impossible to cool. A good example of a CPU series that suffers from this is Intel's Raptor Lake line:
the Intel Core i9-14900K notoriously draws almost 300 watts on a comparatively small die and thus runs very hot.
Thus, to further increase performance, chip designers are trying to make the hardware more efficient, which allows them to further boost performance with extra power headroom.
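As a quick sanity check of the dynamic power term (with made-up but plausible values: an effective switched capacitance of $C = 10\,\mathrm{nF}$, $V = 1.2\,\mathrm{V}$ and $f = 3\,\mathrm{GHz}$):
\[
P_{dyn} = C V^2 f = 10^{-8}\,\mathrm{F} \cdot (1.2\,\mathrm{V})^2 \cdot 3 \cdot 10^9\,\mathrm{Hz} \approx 43\,\mathrm{W}
\]
Note the quadratic dependence on $V$: lowering the supply voltage to $1.0\,\mathrm{V}$ at the same frequency already cuts $P_{dyn}$ to about $30\,\mathrm{W}$, which is why voltage scaling is the main lever against the power wall.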
\content{The Memory Wall} Between 1985 and 2005, CPU performance increased on average by 55\% a year, whereas memory throughput only increased by roughly 10\% a year.
Thus, performance has increasingly become limited by memory rather than raw CPU speed, and to this day memory access is the largest overhead in most applications.
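To see how quickly such a gap compounds, a back-of-the-envelope calculation with the growth rates above:
\[
\frac{1.55^{20}}{1.10^{20}} = \left(\frac{1.55}{1.10}\right)^{20} \approx 1.41^{20} \approx 950
\]
i.e. over those 20 years the gap between CPU and memory performance grew by roughly three orders of magnitude.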
\content{The ILP Wall} While it is possible to improve single core performance using instruction-level parallelism,
this has been thoroughly exhausted and is not a feasible way to significantly improve CPU performance.
Around 2003, all of these walls were hit simultaneously: processors hit the power wall and thus could not be clocked any higher,
memory access times became the limiting factor, and ILP was almost completely exhausted, as code did not contain enough independent instructions.
Current trends are a reduction in clock frequency in favour of more parallelism in the hardware, e.g. by providing more cores, or better caching, branch prediction, etc.


@@ -0,0 +1,9 @@
\subsubsection{Coherency and Consistency}
\fancydef{Coherency} The values in cache all match each other and the processors all see a coherent view of the memory
\fancydef{Consistency} The order in which changes are seen by different processors is consistent
Most modern systems' CPU cores are cache coherent, i.e. the system behaves as if all cores access a single memory array.
This has one big advantage: it is easy to program. However, it is hard to implement in hardware and memory is also slower as a result.
Memory consistency on the other hand is not standardized across companies