[SPCA] start linking

RobinB27
2026-01-13 15:29:26 +01:00
parent 4629cb75e1
commit d2b10811fd
4 changed files with 121 additions and 0 deletions


@@ -0,0 +1,55 @@
\subsection{Linking}
Linking is the final step in the compilation pipeline: separately compiled object files are combined into an executable.
Using a linker has two main advantages:
\begin{enumerate}
    \item \textbf{Separate Compilation}: After changing one source file, only that file has to be recompiled before relinking.
    \item \textbf{Space Optimization}: The executable only contains the code (e.g. from libraries) that is actually used.
\end{enumerate}
\subsubsection{Symbol Resolution}
The first step during Linking is Symbol Resolution.
In the context of Linking, functions and global or \texttt{static} variables are considered \textit{Symbols}. Their definitions are stored in the \textit{Symbol Table} of each object file.
The linker associates each symbol reference with \textit{exactly one} definition.
\inlinedef \textbf{Symbol types}
\begin{itemize}
    \item \textbf{Global Symbols} are defined in this module and can be referenced by other modules (e.g. non-\texttt{static} functions and globals in \texttt{C})
    \item \textbf{External Symbols} are global symbols referenced by this module but defined in another one
    \item \textbf{Local Symbols} are defined and referenced exclusively in one module (e.g. \texttt{static} functions and variables in \texttt{C})
\end{itemize}
Note: Local linker symbols and local program variables are \textit{not} the same. Non-\texttt{static} local variables live on the stack and never appear in the symbol table.
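As a minimal sketch of these symbol kinds, consider the following module. The names \texttt{call\_count}, \texttt{longest} and \texttt{track\_length} are made up for illustration; \texttt{strlen} stands in for any function defined in another module.

```c
#include <stddef.h>
#include <string.h>   /* strlen: external symbol, referenced here, defined in libc */

/* Global symbol: non-static, so other modules may reference it. */
int call_count = 0;

/* Local symbol: static, so only this module can reference it. */
static size_t longest = 0;

/* Global symbol (function). Returns the longest length seen so far. */
size_t track_length(const char *s)
{
    /* Local program variable: lives on the stack, the linker never sees it. */
    size_t n = strlen(s);

    call_count++;
    if (n > longest)
        longest = n;
    return longest;
}
```

Running \texttt{nm} on the resulting object file should list \texttt{call\_count} and \texttt{track\_length} as global, \texttt{longest} as local and \texttt{strlen} as undefined, while \texttt{n} does not show up at all.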
\inlinedef \textbf{Symbol strength}
Duplicate definitions of an uninitialized global either lead to a linker error (\texttt{-fno-common}, the default since GCC 10) or are merged into one symbol (\texttt{-fcommon}).
\begin{itemize}
\item \textbf{Strong Symbols} are procedure names and initialized globals
    \item \textbf{Weak Symbols} are uninitialized globals (only with \texttt{-fcommon})
\end{itemize}
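In \texttt{C}, function symbols can explicitly be declared weak using either of:
\begin{minted}{C}
#pragma weak func                        // variant 1: pragma naming the symbol
__attribute__((weak)) void func(void);   // variant 2: GCC attribute on the declaration
\end{minted}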
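A common use of weak function symbols is to ship a default that another module may override. The following is a minimal sketch, assuming GCC or Clang; \texttt{log\_level} and \texttt{effective\_log\_level} are hypothetical names.

```c
/* Weak definition: used only if no other module provides a strong one.
 * The attribute syntax is a GCC/Clang extension, not standard C. */
__attribute__((weak)) int log_level(void)
{
    return 1;   /* default chosen for this example */
}

/* Callers simply use the symbol; the linker decides which definition wins. */
int effective_log_level(void)
{
    return log_level();
}
```

If another object file defined a regular (strong) \texttt{int log\_level(void)}, the linker would pick that definition instead, without reporting a duplicate.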
\content{Duplicate Handling} The linker uses these definitions to handle duplicates:
\begin{enumerate}
    \item Multiple strong symbols with the same name are illegal (linker error)
    \item Given one strong symbol and one or more weak symbols, pick the strong symbol
    \item Given only weak symbols, pick an \textit{arbitrary} one
\end{enumerate}
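The three rules can be sketched as a small decision function. This only models the rules, it is not how a real linker is implemented; \texttt{struct definition} and \texttt{resolve} are invented for this example, and "arbitrary" is modelled as "the first weak definition seen".

```c
#include <stddef.h>

enum strength { WEAK, STRONG };

struct definition {
    const char *module;       /* hypothetical object file name */
    enum strength strength;
};

/* Returns the index of the chosen definition of one symbol name,
 * or -1 if there is no definition or more than one strong one. */
int resolve(const struct definition *defs, size_t n)
{
    int chosen = -1;
    size_t strong_seen = 0;

    for (size_t i = 0; i < n; i++) {
        if (defs[i].strength == STRONG) {
            if (++strong_seen > 1)
                return -1;        /* rule 1: multiple strong symbols */
            chosen = (int)i;      /* rule 2: strong beats weak */
        } else if (chosen < 0) {
            chosen = (int)i;      /* rule 3: any weak one, here the first */
        }
    }
    return chosen;
}
```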
\subsubsection{Relocation}
The second step during Linking is Relocation.
Code and data sections of the separate object files are combined, and symbols are relocated from their relative locations (in the \texttt{.o} files) to their final absolute locations (in the executable). All references to these symbols are then updated to the new locations.
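The arithmetic behind relocation can be sketched as follows. The helper names are hypothetical, and the PC-relative formula (symbol address plus addend minus address of the reference) is the one used for 32-bit PC-relative relocations on x86-64; other relocation types use different formulas.

```c
#include <stdint.h>

/* Absolute address of a symbol once its section has been placed:
 * base address of the section plus the symbol's offset within it. */
uint64_t absolute_address(uint64_t section_base, uint64_t offset_in_section)
{
    return section_base + offset_in_section;
}

/* Value the linker writes into a 32-bit PC-relative reference located
 * at ref_addr so that it points to symbol_addr. */
int32_t pc_relative_value(uint64_t symbol_addr, int64_t addend, uint64_t ref_addr)
{
    return (int32_t)((int64_t)symbol_addr + addend - (int64_t)ref_addr);
}
```

For instance, with made-up addresses, a reference at offset \texttt{0xf} in a \texttt{.text} section placed at \texttt{0x4004d0} ends up at \texttt{0x4004df}; if the target symbol sits at \texttt{0x4004e8} and the addend is $-4$, the stored value is \texttt{0x5}.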


@@ -0,0 +1,64 @@
\subsection{File types}
The most common file types used during compilation:
\begin{itemize}
\item \textbf{Source Code File} (\texttt{.c}) Uncompiled source code in \texttt{C}.
\item \textbf{Relocatable Object File} (\texttt{.o}) Code \& Data in a format ready for linking.
\item \textbf{Shared Object File} (\texttt{.so}) Special object file, can be loaded \& linked dynamically: at load or run time.
    \item \textbf{Executable File} Code \& Data in a format that can be directly copied into memory \& run.
    \item \textbf{Archive File} (\texttt{.a}) Concatenates related \texttt{.o} files into one \texttt{.a}, with an index.
\end{itemize}
\content{Alternate names} On Windows, \texttt{.dll} files are used instead of \texttt{.so} and are called \textit{Dynamic Link Libraries}.
\subsubsection{Executable and Linkable Format (ELF)}
ELF is the standard unified format for all object files (\texttt{.o}, \texttt{.so} and executables) on Linux and most modern Unix-like systems. It was introduced with UNIX System V.
\begin{center}
\begin{tabular}{l|l}
\textbf{Section} & \textbf{Content} \\
\hline
ELF header & contains basic information: Word size, byte ordering, file type, machine type \\
Segment header table & page size, virtual address memory segments, segment sizes \\
\texttt{.text} & actual code \\
\texttt{.rodata} & (Read-only data) for example, jump tables \\
\texttt{.data} & initialized global variables \\
\texttt{.bss} & uninitialized global variables \\
\texttt{.symtab} & symbol table, procedure and static variable names, section names \& locations. \\
\texttt{.rel.text} & relocation info for \texttt{.text}, e.g. addresses of instructions that need modifying \\
\texttt{.rel.data} & relocation info for \texttt{.data}, e.g. addresses of pointers that need modifying \\
        \texttt{.debug} & info for symbolic debugging (\texttt{gcc -g}) \\
        Section header table & offsets and sizes of each section
\end{tabular}
\end{center}
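To connect the table to source code, the following sketch annotates where each definition typically ends up when compiled with GCC on Linux (assuming \texttt{-fno-common}). All identifiers are invented for this example, and the exact placement may differ between compilers and targets.

```c
int initialized = 42;            /* .data: initialized global */
int uninitialized;               /* .bss: uninitialized global, occupies no file space */
const char greeting[] = "hi";    /* .rodata: read-only data */
static int calls;                /* .bss: local symbol, still listed in .symtab */

/* .text: machine code of the function. */
int add(int a, int b)
{
    calls++;
    return a + b;
}
```

The actual layout of an object file can be checked with \texttt{readelf -S} (section headers) and \texttt{nm} (symbol table).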
% Compact table with only the segment names
% \begin{center}
% \begin{tabular}{|c|}
% \hline
% ELF header \\
% \hline
% Segment header table \\
% \hline
% \texttt{.text} \\
% \hline
% \texttt{.rodata} \\
% \hline
% \texttt{.data} \\
% \hline
% \texttt{.bss} \\
% \hline
% \texttt{.symtab} \\
% \hline
%         \texttt{.rel.text} \\
% \hline
% \texttt{.rel.data} \\
% \hline
% \texttt{.debug} \\
% \hline
% Section header table \\
% \hline
% \end{tabular}
% \end{center}

Binary file not shown.


@@ -125,6 +125,8 @@ If there are changes and you'd like to update this summary, please open a pull r
\newsection
\section{The gcc toolchain}
\input{parts/02_toolchain/00_intro.tex}
\input{parts/02_toolchain/01_linking.tex}
\input{parts/02_toolchain/02_file_types.tex}
% ── Hardware recap ──────────────────────────────────────────────────