Static Linking
- TL;DR: Object files are merely collections of blocks of bytes. Some of these blocks contain program code, others contain program data, and others contain data structures that guide the linker and the loader. A linker concatenates blocks together, decides on run-time locations for the concatenated blocks, and modifies various locations within the code and data blocks.
- Static linkers such as the ld program take as input a collection of relocatable object files (*.o) and command-line arguments, and generate as output a fully linked executable object file that can be loaded and run.
- The input relocatable object files consist of various code and data sections, where each section is a contiguous sequence of bytes.
- To build the executable, the linker must perform two main tasks:
- Step 1. Symbol resolution: Object files define and reference symbols, where each symbol corresponds to a function, a global variable, or a static variable (i.e., any C variable declared with the static attribute). The purpose of symbol resolution is to associate each symbol reference with exactly one symbol definition.
- Step 2. Relocation: Compilers and assemblers generate code and data sections that start at address 0. The linker relocates these sections by associating a memory location with each symbol definition, and then modifying all of the references to those symbols so that they point to this memory location. The linker blindly performs these relocations using detailed instructions, generated by the assembler, called relocation entries.
Symbol resolution
- The linker resolves symbol references by associating each reference with exactly one symbol definition from the symbol tables of its input relocatable object files.
- Symbol resolution is straightforward for references to local symbols that are defined in the same module as the reference.
- Symbol resolution for global symbols is trickier, since multiple object modules might define global symbols with the same name.
How Linkers Resolve Duplicate Symbol Names
- At compile time, the compiler exports each global symbol to the assembler as either strong or weak, and the assembler encodes this information implicitly in the symbol table of the relocatable object file.
- Weak symbols: uninitialized global variables.
- Strong symbols: functions and initialized global variables.
- Linux linkers use the following rules:
- Rule 1: Multiple strong symbols with the same name are not allowed.
- Rule 2: Given a strong symbol and multiple weak symbols with the same name, choose the strong symbol.
- Rule 3: Given multiple weak symbols with the same name, choose any of the weak symbols.
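The three rules can be sketched as a small decision function. This is only an illustrative model, not how ld is implemented: the enum strength type and the resolve_duplicates function are hypothetical names, and each array element stands for one definition of the same symbol name in some input module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: the strength of one definition of a given name. */
enum strength { WEAK = 0, STRONG = 1 };

/*
 * Given the strengths of all definitions that share one name, return the
 * index of the definition a linker following rules 1-3 would pick, or -1
 * if rule 1 is violated (more than one strong definition) or n is 0.
 */
int resolve_duplicates(const enum strength *defs, size_t n)
{
    int chosen = -1;
    size_t strong_count = 0;

    assert(defs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (defs[i] == STRONG) {
            strong_count++;
            chosen = (int)i;      /* rule 2: the strong symbol wins */
        } else if (chosen == -1) {
            chosen = (int)i;      /* rule 3: any weak symbol will do; take the first */
        }
    }
    return strong_count > 1 ? -1 : chosen;   /* rule 1 */
}
```

Calling resolve_duplicates on { WEAK, STRONG, WEAK } selects index 1, while { STRONG, STRONG } is rejected. Rule 3 permits any weak definition; picking the first one is an arbitrary choice made for this sketch.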
- Example 1:
- When two modules each define main, the linker will generate an error because the strong symbol main is defined multiple times (rule 1).
- Example 2:
- When two modules each define x as a strong symbol (for example, an initialized global variable), the linker generates the same error because the strong symbol x is defined multiple times.
- Example 3:
- If x is uninitialized in one module, then the linker will quietly choose the strong symbol defined in the other (rule 2). In this case, x in foo3.c is chosen.
- At run time, function f changes the value of x from 15213 to 15212.
Linking with Static Libraries
- In practice, all compilation systems provide a mechanism for packaging related object modules into a single file called a static library, which can then be supplied as input to the linker.
- When it builds the output executable, the linker copies only the object modules in the library that are referenced by the application program.
- For example, a program that uses functions from the C standard library and the math library could be compiled and linked by listing the corresponding library archives after the source file on the compiler driver's command line.
- In fact, C compiler drivers always pass libc.a to the linker.
- On Linux, static libraries are stored on disk in a particular file format known as an archive (.a suffix). An archive is a collection of concatenated relocatable object files, with a header that describes the size and location of each member object file.
- Consider the following example:
- First, we create a static library, libvector.a, from two object modules, addvec.o and multvec.o.
- Then we compile and link the input files main2.o and libvector.a.
- The -static argument tells the compiler driver that the linker should build a fully linked executable object file that can be loaded into memory and run without any further linking at load time.
- When the linker runs, it determines that the addvec symbol defined by addvec.o is referenced by main2.o, so it copies addvec.o into the executable.
- Since the program doesn't reference any symbols defined by multvec.o, the linker does not copy this module into the executable. The linker also copies the printf.o module from libc.a, along with a number of other modules from the C run-time system.
- Figure 7.8 summarizes the activity of the linker.
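To make the example concrete, here is a minimal sketch of what the two library members might contain. The names addvec and multvec come from the text above; the signatures and bodies are assumptions made for illustration, and in a real build each function would live in its own source file (addvec.c and multvec.c) so that each becomes a separate archive member.

```c
#include <assert.h>
#include <stddef.h>

/* Would be compiled to addvec.o (hypothetical signature and body). */
void addvec(const int *x, const int *y, int *z, size_t n)
{
    assert(x != NULL && y != NULL && z != NULL);
    for (size_t i = 0; i < n; i++)
        z[i] = x[i] + y[i];          /* element-wise sum */
}

/* Would be compiled to multvec.o (hypothetical signature and body). */
void multvec(const int *x, const int *y, int *z, size_t n)
{
    assert(x != NULL && y != NULL && z != NULL);
    for (size_t i = 0; i < n; i++)
        z[i] = x[i] * y[i];          /* element-wise product */
}
```

Because a program such as main2 calls only addvec, the linker needs only the member that defines it. Keeping one function per object module is what allows multvec.o to be left out of the executable.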
How linkers use static libraries to resolve references
- During the symbol resolution phase, the linker scans the relocatable object files and archives left to right in the same sequential order that they appear on the compiler driver’s command line. (The driver automatically translates any .c files on the command line into .o files.)
- During this scan, the linker maintains:
- A set E of relocatable object files that will be merged to form the executable.
- A set U of unresolved symbols (i.e., symbols referred to but not yet defined).
- A set D of symbols that have been defined in previous input files.
- Initially, E, U, and D are empty.
- For each input file f on the command line, the linker determines if f is an object file or an archive. If f is an object file, the linker adds f to E, updates U and D to reflect the symbol definitions and references in f, and proceeds to the next input file.
- If f is an archive, the linker attempts to match the unresolved symbols in U against the symbols defined by the members of the archive. If some archive member m defines a symbol that resolves a reference in U, then m is added to E, and the linker updates U and D to reflect the symbol definitions and references in m. This process iterates over the member object files in the archive until a fixed point is reached where U and D no longer change. At this point, any member object files not contained in E are simply discarded and the linker proceeds to the next input file.
- If U is nonempty when the linker finishes scanning the input files on the command line, it prints an error and terminates. Otherwise, it merges and relocates the object files in E to build the output executable file.
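The scan described above can be modeled in a few lines of C. This is a simplified sketch under stated assumptions, not the linker's actual code: symbols are represented as bits in an unsigned mask, E is tracked only as a count, duplicate definitions are not checked, and all type and function names (symset, module, input, state, scan) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned symset;               /* bit i set <=> symbol i is in the set */

struct module { symset defs, refs; };  /* one relocatable object file */

struct input {                         /* one file on the command line */
    int is_archive;
    const struct module *members;      /* a plain object file has exactly 1 member */
    size_t count;
};

struct state { symset U, D; size_t e_count; };

static void add_module(struct state *st, const struct module *m)
{
    st->e_count++;                     /* m joins E */
    st->D |= m->defs;
    st->U = (st->U | m->refs) & ~st->D;
}

/* Scans the files left to right and returns the final U; 0 means success. */
symset scan(const struct input *files, size_t n, struct state *st)
{
    assert(files != NULL && st != NULL);
    st->U = 0;
    st->D = 0;
    st->e_count = 0;
    for (size_t f = 0; f < n; f++) {
        if (!files[f].is_archive) {
            add_module(st, &files[f].members[0]);
            continue;
        }
        assert(files[f].count <= 32);  /* limit of the 'taken' bit mask */
        unsigned taken = 0;            /* archive members already added to E */
        int changed = 1;
        while (changed) {              /* iterate until a fixed point is reached */
            changed = 0;
            for (size_t i = 0; i < files[f].count; i++) {
                const struct module *m = &files[f].members[i];
                if (!(taken & (1u << i)) && (m->defs & st->U)) {
                    taken |= 1u << i;
                    add_module(st, m);
                    changed = 1;
                }
            }
        }
    }
    return st->U;
}
```

With one object file that references a symbol and one archive that defines it, scan returns 0 when the object file comes first. It returns a nonzero set when the archive comes first, because U is still empty at the moment the archive is examined.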
- The downside of this algorithm is that if the library that defines a symbol appears on the command line before the object file that references that symbol, then the reference will not be resolved and linking will fail.
- For example, if libvector.a is listed before the file that references addvec, then U is empty when libvector.a is processed, so no member object files from libvector.a are added to E. Thus, the reference to addvec is never resolved, and the linker emits an error message and terminates.
- The general rule for libraries is to place them at the end of the command line. If the members of the different libraries are independent, in that no member references a symbol defined by another member, then the libraries can be placed at the end of the command line in any order.
- If, on the other hand, the libraries are not independent, then they must be ordered so that for each symbol s that is referenced externally by a member of an archive, at least one definition of s follows a reference to s on the command line.
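- Example 1: suppose foo.c calls functions in libx.a and libz.a that call functions in liby.a. Then libx.a and libz.a must precede liby.a on the command line.
- Example 2: suppose foo.c calls a function in libx.a that calls a function in liby.a that calls a function in libx.a. Then libx.a must be repeated on the command line, giving the order foo.c, libx.a, liby.a, libx.a.
- Example 3: the dependencies are p.o → libx.a → liby.a and liby.a → libx.a → p.o, where a → b means that a references a symbol defined by b. An ordering that satisfies the rule is p.o, libx.a, liby.a, libx.a.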
Relocation
- Once the linker has completed the symbol resolution step, it has associated each symbol reference in the code with exactly one symbol definition (i.e., a symbol table entry in one of its input object modules). At this point, the linker knows the exact sizes of the code and data sections in its input object modules. It is now ready to begin the relocation step, where it merges the input modules and assigns run-time addresses to each symbol.
- Relocation consists of two steps:
1. Relocating sections and symbol definitions: The linker merges all sections of the same type into a new aggregate section of the same type. For example, the .data sections from the input modules are all merged into one section that will become the .data section for the output executable object file. The linker then assigns run-time memory addresses to the new aggregate sections, to each section defined by the input modules, and to each symbol defined by the input modules. When this step is complete, each instruction and global variable in the program has a unique run-time memory address.
2. Relocating symbol references within sections: The linker modifies every symbol reference in the bodies of the code and data sections so that they point to the correct run-time addresses. To perform this step, the linker relies on data structures in the relocatable object modules known as relocation entries.
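The first step can be illustrated with a minimal sketch that lays out the .text and .data contributions of each input module back to back. It assumes only two section types and ignores alignment and padding, which a real linker must honor; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes of one input module's sections (hypothetical two-section model). */
struct module_sizes { uint64_t text, data; };

/* Run-time start addresses assigned to that module's sections. */
struct module_addrs { uint64_t text, data; };

/*
 * Merge sections of the same type: every module's .text is placed
 * consecutively starting at text_base, and every module's .data is placed
 * consecutively starting at data_base. Alignment is ignored in this sketch.
 */
void assign_addresses(const struct module_sizes *in, struct module_addrs *out,
                      size_t n, uint64_t text_base, uint64_t data_base)
{
    uint64_t t = text_base;
    uint64_t d = data_base;

    assert((in != NULL && out != NULL) || n == 0);
    for (size_t i = 0; i < n; i++) {
        out[i].text = t;
        t += in[i].text;
        out[i].data = d;
        d += in[i].data;
    }
}
```

Once a module's section has a run-time address, the address of a symbol defined in that section is the section address plus the symbol's offset within the section.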
Relocation entries
- When an assembler generates an object module, it does not know where the code and data will ultimately be stored in memory.
- Whenever the assembler encounters a reference to an object whose ultimate location is unknown, it generates a relocation entry that tells the linker how to modify the reference when it merges the object file into an executable.
- Relocation entries for code are placed in .rel.text.
- Relocation entries for data are placed in .rel.data.
- Figure 7.9 shows the format of an ELF relocation entry:
- offset: section offset of the reference that will need to be modified.
- symbol: identifies the symbol that the modified reference should point to.
- addend: a signed constant that is used by some types of relocations to bias the value of the modified reference.
- type: ELF defines 32 different relocation types; here we are concerned with only the two most basic types:
- R_X86_64_PC32: relocate a reference that uses a 32-bit PC-relative address (recall from the Machine-level Representation Control section).
- R_X86_64_32: relocate a reference that uses a 32-bit absolute address (recall from the Machine-level Representation Control section).
- => These two types support the x86-64 small code model (which assumes that the total size of the code and data is smaller than 2 GB).
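As a rough sketch of the entry format, the four fields can be written as a C struct. The struct and helper names below are hypothetical and chosen to avoid clashing with the system's elf.h. The numeric type codes 2 and 10 are the x86-64 ABI values for R_X86_64_PC32 and R_X86_64_32. In the real Elf64_Rela, the symbol index and type share one 64-bit info word, which the helpers mimic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Relocation type codes from the x86-64 ELF ABI. */
enum {
    RELOC_PC32  = 2,    /* R_X86_64_PC32: 32-bit PC-relative */
    RELOC_ABS32 = 10    /* R_X86_64_32: 32-bit absolute */
};

/* Sketch of a relocation entry; field names follow the list above. */
struct rela_entry {
    uint64_t offset;    /* section offset of the reference to modify */
    uint32_t type;      /* relocation type */
    uint32_t symbol;    /* index of the symbol the reference should point to */
    int64_t  addend;    /* signed constant that biases the stored value */
};

/* ELF packs symbol and type into one word: symbol in the high 32 bits. */
uint64_t pack_info(uint32_t symbol, uint32_t type)
{
    return ((uint64_t)symbol << 32) | type;
}

uint32_t info_symbol(uint64_t info) { return (uint32_t)(info >> 32); }
uint32_t info_type(uint64_t info)   { return (uint32_t)(info & 0xffffffffu); }

/* True if the entry uses one of the two types discussed in this section. */
int rela_is_supported(const struct rela_entry *r)
{
    assert(r != NULL);
    return r->type == RELOC_PC32 || r->type == RELOC_ABS32;
}
```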
- Relocating symbol references
- Linker’s relocation algorithm:
- Assume that each section s is an array of bytes and that each relocation entry r is a struct of type Elf64_Rela (defined in Figure 7.9).
- Assume that when the algorithm runs, the linker has already chosen run-time addresses for each section (e.g., .data, .text), denoted ADDR(s), and for each symbol, denoted ADDR(r.symbol).
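Under these assumptions, the relocation loop for one section can be sketched as follows. This is an illustrative version rather than the listing from Figure 7.10: the struct reloc type is hypothetical, its sym_addr field stands in for ADDR(r.symbol), and the patched value is written in little-endian byte order as on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { R_PC32 = 2, R_ABS32 = 10 };   /* R_X86_64_PC32 and R_X86_64_32 */

/* Hypothetical, pre-resolved relocation entry. */
struct reloc {
    uint64_t offset;     /* r.offset: where in the section to patch */
    uint32_t type;       /* r.type */
    int64_t  addend;     /* r.addend */
    uint64_t sym_addr;   /* ADDR(r.symbol), already chosen by the linker */
};

/* Patch section s (size bytes, run-time address addr_s) in place. */
void relocate_section(uint8_t *s, size_t size, uint64_t addr_s,
                      const struct reloc *rels, size_t nrels)
{
    assert(s != NULL && (rels != NULL || nrels == 0));
    for (size_t i = 0; i < nrels; i++) {
        const struct reloc *r = &rels[i];
        uint32_t value;

        assert(r->offset + 4 <= size);
        if (r->type == R_PC32) {
            /* Run-time address of the reference itself. */
            uint64_t refaddr = addr_s + r->offset;
            value = (uint32_t)(r->sym_addr + (uint64_t)r->addend - refaddr);
        } else if (r->type == R_ABS32) {
            value = (uint32_t)(r->sym_addr + (uint64_t)r->addend);
        } else {
            continue;    /* other relocation types are outside this sketch */
        }
        for (int b = 0; b < 4; b++)  /* store the 32-bit value, little-endian */
            s[r->offset + b] = (uint8_t)(value >> (8 * b));
    }
}
```

The only difference between the two cases is that the PC-relative case subtracts the run-time address of the reference, so the stored value is a displacement rather than an address.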
- Example: consider the C program from Figure 7.1.
- The disassembled code from main.o is obtained with the objdump tool and shows the instructions generated for main.
- The main function references two global symbols, sum and array. For each reference, the assembler has generated a relocation entry (stored in .rel.text or .rel.data).
- The relocation entries tell the linker that the reference to sum should be relocated using a 32-bit PC-relative address, and the reference to array should be relocated using a 32-bit absolute address.
- Relocating PC-relative references (at line 6):
- The corresponding relocation entry r consists of four fields: offset, symbol, type, and addend.
- These fields tell the linker to modify the 32-bit PC-relative reference starting at offset 0xf so that it will point to the sum routine at run time.
- Suppose that the linker has determined the run-time addresses ADDR(s) of the section and ADDR(r.symbol) of sum.
- The linker first computes the run-time address of the reference (line 7 in Figure 7.10).
- The linker then updates the reference so that it will point to the sum routine at run time (line 8).
- In the resulting executable object file, the call instruction has its relocated form, the bytes e8 05 00 00 00.
- At run time, the call instruction will be located at address 0x4004de. When the CPU executes the call instruction, the PC has a value of 0x4004e3 (0x4004de + 5, the length in bytes of e8 05 00 00 00), which is the address of the instruction following the call instruction (recall Section 3.3).
- The address of the next instruction to execute is therefore 0x4004e3 + 0x5 = 0x4004e8, where 0x5 is the 32-bit little-endian displacement 05 00 00 00 that follows the e8 opcode. This is the address of sum, which is what we want.
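The arithmetic behind the PC-relative case can be checked against the addresses quoted above. The values of ADDR(s) and r.addend are not stated in the text. They are inferred here on the assumption that the one-byte e8 opcode sits at section offset 0xe, immediately before the reference at offset 0xf, and that the stored displacement is 0x5.

```latex
\begin{aligned}
\mathrm{ADDR}(s) &= \texttt{0x4004de} - \texttt{0xe} = \texttt{0x4004d0} \\
\mathit{refaddr} &= \mathrm{ADDR}(s) + r.\mathit{offset}
                  = \texttt{0x4004d0} + \texttt{0xf} = \texttt{0x4004df} \\
{*\mathit{refptr}} &= \mathrm{ADDR}(r.\mathit{symbol}) + r.\mathit{addend} - \mathit{refaddr}
                  = \texttt{0x4004e8} + (-4) - \texttt{0x4004df} = \texttt{0x5}
\end{aligned}
```

The addend of -4 compensates for the 4 bytes between the start of the reference and the next instruction, since the CPU adds the displacement to the address of the instruction that follows the call.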
- Relocating absolute references:
- The corresponding relocation entry r consists of four fields: offset, symbol, type, and addend.
- These fields tell the linker to modify the absolute reference starting at offset 0xa so that it will point to the first byte of array at run time.
- Suppose that the linker has determined the run-time address ADDR(r.symbol) of array.
- The linker updates the reference by storing ADDR(r.symbol) + r.addend at it.
- In the resulting executable object file, the reference then holds the resulting 32-bit absolute address.
- Putting it all together, Figure 7.12 shows the relocated .text and .data sections in the final executable object file.