[SPCA] More ASM

2026-01-09 17:05:00 +01:00
parent d248e215bb
commit 74786112cf
5 changed files with 54 additions and 0 deletions

@@ -1,3 +1,4 @@
\newpage
\subsection{The syntax}
There are two common styles: AT\&T syntax (common on UNIX) and Intel syntax (common on Windows).
@@ -9,3 +10,50 @@ The state that is visible to us is:
\end{itemize}
To view what \lC\ code looks like in assembly, we can use \texttt{gcc -O0 -S code.c}, which produces the file \texttt{code.s} containing the assembly code.
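As a minimal sketch, a file like the following could serve as \texttt{code.c} for trying out the command above. The function name \texttt{add} and its contents are placeholders chosen for illustration.

```c
/* code.c: hypothetical input for `gcc -O0 -S code.c` */
long add(long a, long b) {
    /* Without optimisation (-O0), the generated assembly typically
       stays close to the source, which makes it easier to read. */
    long sum = a + b;
    return sum;
}
```

No \texttt{main} function is needed here, since \texttt{-S} stops after generating assembly and does not link.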
\subsubsection{Registers}
\texttt{x86} assembly is a bit particular with register naming (in AT\&T syntax, all register names start with \texttt{\%}).
The initial 16-bit version of \texttt{x86} had the registers listed below.
Sub-registers give access to the high (\texttt{h} suffix) or low (\texttt{l} suffix) byte of a register; only the registers ending in \texttt{x} have them.
These four registers, as well as \texttt{\%si} and \texttt{\%di}, are general purpose:
\begin{tables}{lll}{Name & Sub-registers & Description}
\texttt{\%ax} & \texttt{\%ah}, \texttt{\%al} & Accumulator \\
\texttt{\%cx} & \texttt{\%ch}, \texttt{\%cl} & Counter \\
\texttt{\%dx} & \texttt{\%dh}, \texttt{\%dl} & Data \\
\texttt{\%bx} & \texttt{\%bh}, \texttt{\%bl} & Base \\
\texttt{\%si} & - & Source index \\
\texttt{\%di} & - & Destination index \\
\hline
\texttt{\%sp} & - & Stack pointer \\
\texttt{\%bp} & - & Base pointer \\
\texttt{\%ip} & - & Instruction pointer \\
\texttt{\%sr} & - & Status (flags) \\
\end{tables}
When the architecture was extended to 32-bit, all previously available registers were retained and a 32-bit version of each was introduced with the prefix \texttt{e}.
In other words, any 16-bit code would still work as before, since e.g. the \texttt{\%ax} register was now simply the lower 16 bits of the \texttt{\%eax} register.
The same happened again when extending to 64-bit, only this time the \texttt{r} prefix was used.
So, the register \texttt{\%eax} is now the lower 32 bits of \texttt{\%rax}.
Additionally, the 64-bit extension added the registers \texttt{\%rX}, with \texttt{X} to be substituted with 8 through 15, whose lower 32 bits are accessible as \texttt{\%rXd}.
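The nesting of the register names can be modelled in \lC\ by truncating a 64-bit value. This is only a sketch of the aliasing: the function names are made up for illustration, and the parameter \texttt{rax} merely stands in for the contents of \texttt{\%rax}.

```c
#include <stdint.h>

/* Each function returns the part of a 64-bit register value
   that the corresponding smaller register name refers to. */
uint32_t low32(uint64_t rax) { return (uint32_t)rax; }        /* like %eax */
uint16_t low16(uint64_t rax) { return (uint16_t)rax; }        /* like %ax  */
uint8_t  low8(uint64_t rax)  { return (uint8_t)rax; }         /* like %al  */
uint8_t  high8(uint64_t rax) { return (uint8_t)(rax >> 8); }  /* like %ah  */
```

For the value \texttt{0x1122334455667788}, this yields \texttt{0x55667788}, \texttt{0x7788}, \texttt{0x88} and \texttt{0x77} respectively.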
\subsubsection{Instructions}
Instructions usually consist of a three-letter mnemonic followed by a one-letter suffix, where the suffix indicates the operand size in bytes.
The following postfixes are available: \texttt{b} (byte, 1 byte), \texttt{w} (word, 2 bytes), \texttt{l} (long word, 4 bytes) and \texttt{q} (quad, 8 bytes).
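The mapping from suffix to operand size can be written down as a small lookup. This is a sketch for illustration only, and the helper \texttt{suffix\_size} is a hypothetical name rather than part of any toolchain.

```c
#include <stddef.h>

/* Returns the operand size in bytes for a size suffix,
   or 0 if the character is not one of b, w, l, q. */
size_t suffix_size(char suffix) {
    switch (suffix) {
    case 'b': return 1;  /* byte */
    case 'w': return 2;  /* word */
    case 'l': return 4;  /* long word */
    case 'q': return 8;  /* quad word */
    default:  return 0;  /* unknown suffix */
    }
}
```

For example, \texttt{movq} operates on 8 bytes, whereas \texttt{movb} operates on a single byte.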
Source and destination operands can be registers, immediates (as source only) or memory addresses.
\content{Immediates} To use a constant value (also called an immediate) in an instruction, we prefix the number with \texttt{\$}; a plain number is interpreted as decimal.
For hexadecimal values, we add \texttt{0x} after the \texttt{\$}, e.g. \texttt{\$0x1F}.
\content{Memory addresses} To treat a register as a memory address, we put it in parentheses, e.g. \texttt{(\%rax)} interprets the value of \texttt{\%rax} as a memory address.
The instruction then accesses as many bytes at that address as its suffix specifies.
The full syntax for memory addressing modes is \texttt{D(Rb, Ri, S)}, where
\begin{itemize}[noitemsep]
    \item \texttt{D}: Displacement, a constant offset that is encoded in 1, 2 or 4 bytes (or omitted, which corresponds to 0)
    \item \texttt{Rb}: Base register, to which the offsets are added. Can be any of the 16 integer registers
    \item \texttt{Ri}: Index register. Can be any register except \texttt{\%rsp} (\texttt{\%rbp} is also rarely used)
    \item \texttt{S}: Scale factor (1, 2, 4 or 8), by which the index is multiplied, typically the size in bytes of one array element
\end{itemize}
The memory location that is accessed is then computed as follows: \texttt{Mem[ Reg[Rb] + S * Reg[Ri] + D ]}.
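The address part of this computation can be sketched in \lC\ as follows. The function \texttt{effective\_address} and its parameter names are assumptions made for illustration; the register contents are passed in as plain integers, and the displacement is modelled as a signed 32-bit value.

```c
#include <stdint.h>

/* Computes Reg[Rb] + S * Reg[Ri] + D, i.e. the address that
   Mem[...] is then indexed with. Arithmetic wraps modulo 2^64. */
uint64_t effective_address(int32_t d, uint64_t rb, uint64_t ri, uint8_t s) {
    /* Sign-extend the displacement to 64 bits before adding it. */
    return rb + (uint64_t)s * ri + (uint64_t)(int64_t)d;
}
```

As a usage example, the operand \texttt{8(\%rdi,\%rsi,4)} with \texttt{\%rdi = 0x1000} and \texttt{\%rsi = 3} corresponds to \texttt{effective\_address(8, 0x1000, 3, 4)}, which gives \texttt{0x1000 + 4 * 3 + 8 = 0x1014}.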