[SPCA] Synchronization

2026-06-13 06:01:19 +02:00 · 2026-01-22 07:35:49 +01:00
parent 5dffd7391c
commit f7eeac5470
9 changed files with 40 additions and 9 deletions
@@ -0,0 +1,13 @@
+void acquire( int *lock ) {
+    while ( TAS( lock ) == 1 );
+}
+
+void acquire_tatas( int *lock ) {
+    do {
+        while ( *(volatile int *) lock == 1 );
+    } while ( TAS( lock ) == 1 );
+}
+
+void release( int *lock ) {
+    *lock = 0;
+}
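The snippet above leaves `TAS` undefined. As a minimal sketch, and assuming a C11 toolchain with `<stdatomic.h>`, it could be modelled with `atomic_exchange`, which atomically stores `1` and returns the previous value. The `atomic_int` lock type and the `tas_self_check` helper are assumptions made for illustration and are not part of the commit.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumption: TAS is modelled as an atomic exchange with 1.
 * It returns the value the lock held *before* the call. */
static int TAS(atomic_int *lock) {
    return atomic_exchange(lock, 1);
}

/* Plain spinlock: every iteration performs an atomic write. */
static void acquire(atomic_int *lock) {
    while (TAS(lock) == 1)
        ; /* lock was already held, try again */
}

/* TATAS: spin on an atomic read first, only attempt the
 * TAS once the lock looks free. */
static void acquire_tatas(atomic_int *lock) {
    do {
        while (atomic_load(lock) == 1)
            ; /* read-only spinning */
    } while (TAS(lock) == 1);
}

static void release(atomic_int *lock) {
    atomic_store(lock, 0);
}

/* Hypothetical single-threaded check of the semantics above. */
static void tas_self_check(void) {
    atomic_int lock = 0;
    assert(TAS(&lock) == 0);      /* lock was free, now taken */
    assert(TAS(&lock) == 1);      /* second attempt sees it held */
    release(&lock);
    acquire_tatas(&lock);         /* lock is free, returns at once */
    assert(atomic_load(&lock) == 1);
    release(&lock);
    acquire(&lock);
    assert(atomic_load(&lock) == 1);
}
```

Calling `tas_self_check()` from a `main` exercises the lock on a single thread only. It checks the return-value convention, not behaviour under contention.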
@@ -1,4 +1,4 @@
-\subsubsection{Synchronization}
+\subsubsection{Relaxing Sequential Consistency}
 As we have outlined, sequential consistency may not be desirable when trying to build a high-performance system.
 We thus may want to relax sequential consistency.
 A primary reason for this is that out-of-order execution gives a massive speed boost, as we do not have to wait for slow memory accesses to finish
@@ -22,7 +22,3 @@ This instruction stops the CPU reordering past it, i.e. any instruction before t
 However, any instructions past the fence are fair game and can be reordered (i.e. two instructions behind the fence can be reordered).

 If we only need this for loads or stores, we can use \texttt{lfence} or \texttt{sfence}, respectively.
-
-\content{TAS} (Test-and-Set)
-
-\content{CAS} (Compare-and-Swap)
@@ -0,0 +1,21 @@
+\newpage
+\subsubsection{Multicore synchronization}
+There are two main ways to synchronize, which are:
+\begin{enumerate}
+    \item \bi{Atomic operations} such as \texttt{TAS}, \texttt{CAS}, etc.
+          These are still subject to the ordering constraints specified by the memory model.
+    \item \bi{Interprocessor interrupts} (IPIs). These invoke an interrupt handler on a remote CPU,
+          but are VERY slow (500+ cycles) and thus usually avoided, except in the OS.
+\end{enumerate}
+
+\content{TAS} (Test-and-Set) TAS atomically sets a memory location to $1$ and returns its previous value, so the caller only succeeds in claiming the location if it previously held $0$.
+It can thus be used to build a mutex as a simple spinlock, which is easy to implement and often the fastest option if the lock isn't held for long.
+Since a contended lock usually does not read as \texttt{0}, most TAS attempts fail; a TATAS (Test And Test-and-Set) lock therefore spins on a plain read and only issues the TAS once the lock looks free, which reduces the performance overhead.
+
+\inputcodewithfilename{c}{}{code-examples/03_hw/01_tas.c}
+
+A word of caution: Do not use TAS to check if a value has changed outside a lock.
+It will most likely not work in \lC\ and almost certainly not in \texttt{Java} or any higher-level language.
+
+
+\content{CAS} (Compare-and-Swap)
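The CAS entry is still a stub in this commit. As a hedged sketch of the usual semantics, and assuming C11 `<stdatomic.h>`, compare-and-swap can be expressed with `atomic_compare_exchange_strong`: the new value is written only if the location still holds the expected one. The names `CAS`, `acquire_cas` and `fetch_increment` are hypothetical helpers, not taken from the summary.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: returns true iff *addr held `expected`
 * and was atomically replaced by `desired`. */
static bool CAS(atomic_int *addr, int expected, int desired) {
    /* On failure the library writes the observed value into the
     * local copy of `expected`; this sketch simply discards it. */
    return atomic_compare_exchange_strong(addr, &expected, desired);
}

/* Spinlock built on CAS: take the lock only if it is 0. */
static void acquire_cas(atomic_int *lock) {
    while (!CAS(lock, 0, 1))
        ; /* lock held by someone else, retry */
}

/* Lock-free increment: retry until no other thread modified
 * the counter between our read and our CAS. */
static int fetch_increment(atomic_int *counter) {
    int old;
    do {
        old = atomic_load(counter);
    } while (!CAS(counter, old, old + 1));
    return old;
}
```

Unlike TAS, the retry loop in `fetch_increment` needs no lock at all. It just repeats the update if the value changed underneath it.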
@@ -158,10 +158,11 @@ If there are changes and you'd like to update this summary, please open a pull r
 \input{parts/03_hw/06_multicore/00_background.tex}
 \input{parts/03_hw/06_multicore/01_limitations.tex}
 \input{parts/03_hw/06_multicore/02_consistency-coherencey.tex}
-\input{parts/03_hw/06_multicore/03_sync.tex}
-\input{parts/03_hw/06_multicore/04_smp.tex}
-\input{parts/03_hw/06_multicore/05_numa.tex}
-\input{parts/03_hw/06_multicore/06_optim.tex}
+\input{parts/03_hw/06_multicore/03_relaxing-seq-consistency.tex}
+\input{parts/03_hw/06_multicore/04_sync.tex}
+\input{parts/03_hw/06_multicore/05_smp.tex}
+\input{parts/03_hw/06_multicore/06_numa.tex}
+\input{parts/03_hw/06_multicore/07_optim.tex}
 \input{parts/03_hw/07_dev.tex}