mirror of
https://github.com/janishutz/eth-summaries.git
synced 2026-03-14 10:50:05 +01:00
[SPCA] Synchronization
13
semester3/spca/code-examples/03_hw/01_tas.c
Normal file
@@ -0,0 +1,13 @@
void acquire( int *lock ) {
    while ( TAS( lock ) == 1 );        // spin until TAS returns 0, i.e. the lock was free
}

void acquire_tatas( int *lock ) {
    do {
        while ( *lock == 1 );          // spin on a plain read while the lock is held
    } while ( TAS( lock ) == 1 );      // lock looked free: try to take it atomically
}

void release( int *lock ) {
    *lock = 0;                         // mark the lock as free again
}
@@ -1,4 +1,4 @@
-\subsubsection{Synchronization}
+\subsubsection{Relaxing Sequential Consistency}
 As we have outlined, sequential consistency may not be desirable when trying to build a high-performance system.
 We thus may want to relax sequential consistency.
 A primary reason for this is that out-of-order execution gives a massive speed boost, as we do not have to wait for slow memory accesses to finish
@@ -22,7 +22,3 @@ This instruction stops the CPU reordering past it, i.e. any instruction before t
 However, instructions after the fence are fair game and can be reordered among themselves (i.e. two instructions that both follow the fence can still be reordered with each other).

 If we only need this for loads or stores, we can use \texttt{lfence} or \texttt{sfence}, respectively.
-
-\content{TAS} (Test-and-Set)
-
-\content{CAS} (Compare-and-Swap)
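To make the fence behaviour more concrete, here is a minimal sketch that uses the portable C11 `atomic_thread_fence` instead of the x86 fence instructions directly. This is an assumption for illustration: the C11 fences are not the same thing as `lfence` or `sfence`, and which instruction (if any) gets emitted is up to the compiler. The names `payload`, `ready`, `publish` and `try_consume` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data, written before the flag */
static atomic_bool ready;  /* flag, written after the data (starts out false) */

/* Writer: the release fence keeps the payload store from moving below the flag store. */
void publish(int value) {
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader: the acquire fence keeps the payload load from moving above the flag load. */
bool try_consume(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;      /* nothing published yet */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return true;
}
```

Without the two fences, the relaxed flag accesses alone would not stop the compiler or CPU from reordering the payload access around the flag access.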
21
semester3/spca/parts/03_hw/06_multicore/04_sync.tex
Normal file
@@ -0,0 +1,21 @@
\newpage
\subsubsection{Multicore synchronization}
There are two main ways to synchronize between cores:
\begin{enumerate}
    \item \bi{Atomic operations} such as \texttt{TAS}, \texttt{CAS}, etc.
          They are still subject to the ordering constraints specified in the memory model.
    \item \bi{Interprocessor interrupts} (IPIs). These invoke an interrupt handler on a remote CPU,
          but are very slow (500+ cycles) and thus often avoided, except in the OS.
\end{enumerate}

\content{TAS} (Test-and-Set) atomically reads a memory location and sets it to $1$, returning the old value. The caller has acquired the lock only if the old value was $0$.
It can thus be used to build a mutex as a simple spinlock, which is easy to implement and often the fastest option if the lock isn't held for long.
Since a contended lock mostly holds a value other than \texttt{0}, we can use a TATAS (Test And Test-and-Set) lock to reduce the performance overhead: it spins on a plain read and only issues the more expensive \texttt{TAS} once the lock appears free.

\inputcodewithfilename{c}{}{code-examples/03_hw/01_tas.c}

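The listing leaves `TAS` undefined. As a minimal sketch, assuming C11 `<stdatomic.h>` is available, it could be expressed with `atomic_exchange`, which atomically writes a value and returns the previous one. The names `tas_lock_t`, `tas` and `tas_demo` are hypothetical and not part of the course code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef atomic_int tas_lock_t;

/* Test-and-set: atomically store 1 and return the previous value. */
static int tas(tas_lock_t *lock) {
    return atomic_exchange(lock, 1);
}

/* Plain TAS spinlock: every iteration performs an atomic write. */
static void acquire(tas_lock_t *lock) {
    while (tas(lock) == 1) { /* spin */ }
}

/* TATAS: spin on an atomic load and only try TAS once the lock looks free. */
static void acquire_tatas(tas_lock_t *lock) {
    do {
        while (atomic_load(lock) == 1) { /* spin on a read only */ }
    } while (tas(lock) == 1);
}

static void release(tas_lock_t *lock) {
    atomic_store(lock, 0);
}

/* Single-threaded walk through the semantics; returns the final lock value. */
int tas_demo(void) {
    tas_lock_t lock = 0;
    acquire(&lock);             /* lock was free, so this returns at once */
    assert(tas(&lock) == 1);    /* already held: TAS reports 1 */
    release(&lock);
    acquire_tatas(&lock);       /* free again, so TATAS also succeeds */
    release(&lock);
    return atomic_load(&lock);  /* the lock ends up free */
}
```

Calling `tas_demo()` from a `main` function exercises the lock in a single thread. Because the spin in `acquire_tatas` is an atomic load, the compiler may not hoist it out of the loop, which it would be allowed to do with a plain `*lock == 1` read.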
A word of caution: Do not rely on a test outside the lock (as TATAS does) to check if a value has changed in regular code.
It will most likely not work in \lC\ and almost certainly not in \texttt{Java} or any other higher-level language.


\content{CAS} (Compare-and-Swap)
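CAS is only named here, without further detail. Under its usual definition, it compares a memory location with an expected value and, only if they match, atomically writes a new value. A minimal sketch, assuming C11 `<stdatomic.h>`, with the hypothetical helper names `cas` and `cas_increment`:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Compare-and-swap: if *loc == expected, store desired.
 * Returns true if the swap happened, false otherwise. */
bool cas(atomic_int *loc, int expected, int desired) {
    return atomic_compare_exchange_strong(loc, &expected, desired);
}

/* Lock-free increment built from a CAS retry loop; returns the new value. */
int cas_increment(atomic_int *counter) {
    int old;
    do {
        old = atomic_load(counter);          /* snapshot the current value */
    } while (!cas(counter, old, old + 1));   /* retry if someone else changed it */
    return old + 1;
}
```

The retry loop is the typical way CAS is used: read a value, compute the update, and only commit it if nobody else modified the location in the meantime.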
Binary file not shown.
@@ -158,10 +158,11 @@ If there are changes and you'd like to update this summary, please open a pull r
 \input{parts/03_hw/06_multicore/00_background.tex}
 \input{parts/03_hw/06_multicore/01_limitations.tex}
 \input{parts/03_hw/06_multicore/02_consistency-coherencey.tex}
-\input{parts/03_hw/06_multicore/03_sync.tex}
-\input{parts/03_hw/06_multicore/04_smp.tex}
-\input{parts/03_hw/06_multicore/05_numa.tex}
-\input{parts/03_hw/06_multicore/06_optim.tex}
+\input{parts/03_hw/06_multicore/03_relaxing-seq-consistency.tex}
+\input{parts/03_hw/06_multicore/04_sync.tex}
+\input{parts/03_hw/06_multicore/05_smp.tex}
+\input{parts/03_hw/06_multicore/06_numa.tex}
+\input{parts/03_hw/06_multicore/07_optim.tex}
 \input{parts/03_hw/07_dev.tex}