Static Linking

TL;DR: Object files are merely collections of blocks of bytes. Some of these blocks contain program code, others contain program data, and others contain data structures that guide the linker and the loader. A linker concatenates blocks together, decides on run-time locations for the concatenated blocks, and modifies various locations within the code and data blocks.
Static linkers such as the Linux ld program take as input a collection of relocatable object files (*.o) and command-line arguments, and generate as output a fully linked executable object file that can be loaded and run.
The input relocatable object files consist of various code and data sections, where each section is a contiguous sequence of bytes.
To build the executable, the linker must perform two main tasks:
- Step 1. Symbol resolution: Object files define and reference symbols, where each symbol corresponds to a function, a global variable, or a static variable (i.e., any C variable declared with the static attribute). The purpose of symbol resolution is to associate each symbol reference with exactly one symbol definition.
- Step 2. Relocation: Compilers and assemblers generate code and data sections that start at address 0. The linker relocates these sections by associating a memory location with each symbol definition, and then modifying all of the references to those symbols so that they point to this memory location. The linker blindly performs these relocations using detailed instructions, generated by the assembler, called relocation entries.

Symbol resolution

The linker resolves symbol references by associating each reference with exactly one symbol definition from the symbol tables of its input relocatable object files.
Symbol resolution is straightforward for references to local symbols that are defined in the same module as the reference.
Symbol resolution for global symbols is trickier, since multiple object modules might define global symbols with the same name.

How Linkers Resolve Duplicate Symbol Names

At compile time, the compiler exports each global symbol to the assembler as either strong or weak and the assembler encodes this information implicitly in the symbol table of the relocatable object file.
- Weak symbols: uninitialized global variables.
- Strong symbols: functions and initialized global variables.
Linux linkers use the following rules:
- Rule 1: Multiple strong symbols with the same name are not allowed.
- Rule 2: Given a strong symbol and multiple weak symbols with the same name, choose the strong symbol.
- Rule 3: Given multiple weak symbols with the same name, choose any of the weak symbols.
Example:
foo1.c
```
int main() {
    return 0;
}
```
bar1.c
```
int main() {
    return 0;
}
```
- In this case, the linker will generate an error because the strong symbol main is defined multiple times (rule 1).
Example 2:
foo2.c
```
int x = 15213;

int  main() {
    return 0;
}
```
bar2.c
```
int x = 15123;

void f() {
}
```
- In this case, the linker reports the same error because the strong symbol x is defined multiple times (rule 1).
Example 3:
foo3.c
```
#include <stdio.h>
void f(void);

int x = 15213;

int main(){
    f();
    printf("x = %d\n", x);
    return 0;
}
```
bar3.c
```
int x;
void f(){
    x = 15212;
}
```
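- If x is uninitialized in one module, then the linker will quietly choose the strong symbol defined in the other (rule 2). In this case, x in foo3.c is chosen.
- At run time, function f changes the value of x from 15213 to 15212, and main prints x = 15212.
- Note that GCC 10 and later default to -fno-common, so with those compilers this example links only when compiled with -fcommon; otherwise the linker reports a multiple definition of x.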

Linking with Static Libraries

In practice, all compilation systems provide a mechanism for packaging related object modules into a single file called a static library, which can then be supplied as input to the linker.
When it builds the output executable, the linker copies only the object modules in the library that are referenced by the application program.
For example, a program that uses functions from the C standard library and the math library could be compiled and linked with the following command:
```
gcc main.c /usr/lib/libm.a /usr/lib/libc.a
```
- In fact, C compiler drivers always pass libc.a to the linker.
On Linux, static libraries are stored on disk in a particular file format known as an archive (.a suffix). An archive is a collection of concatenated relocatable object files, with a header that describes the size and location of each member object file.

Consider the following example:

addvec.c
```
int addcnt = 0;

void addvec(int *x, int *y, int *z, int n){
    int i;

    addcnt++;

    /* z = x + y, element by element */
    for (i = 0; i < n; i++)
        z[i] = x[i] + y[i];
}
```

multvec.c
```
int multcnt = 0;

void multvec(int *x, int *y, int *z, int n){
    int i;

    multcnt++;

    /* z = x * y, element by element, over all n elements */
    for (i = 0; i < n; i++){
        z[i] = x[i] * y[i];
    }
}
```

vector.h
```
extern void multvec(int *x, int *y, int *z, int n);
extern void addvec(int *x, int *y, int *z, int n);
```

main2.c
```
#include <stdio.h>
#include "vector.h"

int x[2] = {1,2};
int y[2] = {3,4};
int z[2];

int main(){
    addvec(x,y,z,2);
    printf("z = [%d %d]\n", z[0], z[1]);
    return 0;
}
```

First, we create a static library of 2 functions:

```
gcc -c addvec.c multvec.c
ar rcs libvector.a addvec.o multvec.o
```

Then we compile and link the input files main2.o and libvector.a:

```
gcc -c main2.c
gcc -static -o prog2c main2.o ./libvector.a
```

The -static argument tells the compiler driver that the linker should build a fully linked executable object file that can be loaded into memory and run without any further linking at load time.
When the linker runs, it determines that the addvec symbol defined by addvec.o is referenced by main2.o so it copies addvec.o into the executable.
Since the program doesn’t reference any symbols defined by multvec.o, the linker does not copy this module into the executable. The linker also copies the printf.o module from libc.a, along with a number of other modules from the C run-time system.
Figure 7.8 summarizes the activity of the linker.

How linkers use static libraries to resolve references

During the symbol resolution phase, the linker scans the relocatable object files and archives left to right in the same sequential order that they appear on the compiler driver’s command line. (The driver automatically translates any .c files on the command line into .o files.)
During this scan, the linker maintains three sets and updates them as follows:
- A set E of relocatable object files that will be merged to form the executable
- A set U of unresolved symbols (i.e., symbols referred to but not yet defined)
- A set D of symbols that have been defined in previous input files.
- Initially, E, U, and D are empty.
- For each input file f on the command line, the linker determines if f is an object file or an archive. If f is an object file, the linker adds f to E, updates U and D to reflect the symbol definitions and references in f, and proceeds to the next input file.
- If f is an archive, the linker attempts to match the unresolved symbols in U against the symbols defined by the members of the archive. If some archive member m defines a symbol that resolves a reference in U, then m is added to E, and the linker updates U and D to reflect the symbol definitions and references in m. This process iterates over the member object files in the archive until a fixed point is reached where U and D no longer change. At this point, any member object files not contained in E are simply discarded and the linker proceeds to the next input file.
- If U is nonempty when the linker finishes scanning the input files on the command line, it prints an error and terminates. Otherwise, it merges and relocates the object files in E to build the output executable file.
The downside of this algorithm is that command-line order matters: if the library that defines a symbol appears on the command line before the object file that references that symbol, then the reference will not be resolved and linking will fail.
```
gcc -static ./libvector.a main2.c
# This will fail
```
- When libvector.a is processed, U is empty, so no member object files from libvector.a are added to E. Thus, the reference to addvec is never resolved and the linker emits an error message and terminates.
The general rule for libraries is to place them at the end of the command line. If the members of the different libraries are independent, in that no member references a symbol defined by another member, then the libraries can be placed at the end of the command line in any order.
If, on the other hand, the libraries are not independent, then they must be ordered so that for each symbol s that is referenced externally by a member of an archive, at least one definition of s follows a reference to s on the command line.
Example: suppose foo.c calls functions in libx.a and libz.a that call functions in liby.a:
```
gcc foo.c libx.a libz.a liby.a
```
Example 2: suppose foo.c calls a function in libx.a that calls a function in liby.a that calls a function in libx.a. Then libx.a must be repeated:
```
gcc foo.c libx.a liby.a libx.a
```
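Example 3: suppose the dependencies are p.o → libx.a → liby.a and liby.a → libx.a → p.o. Because p.o is an object file, its definitions are already in D by the time the archives are scanned, so only libx.a needs to be repeated: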
```
gcc p.o libx.a liby.a libx.a
```

Relocation

Once the linker has completed the symbol resolution step, it has associated each symbol reference in the code with exactly one symbol definition (i.e., a symbol table entry in one of its input object modules). At this point, the linker knows the exact sizes of the code and data sections in its input object modules. It is now ready to begin the relocation step, where it merges the input modules and assigns run-time addresses to each symbol.
Relocation consists of two steps:
- Step 1. Relocating sections and symbol definitions: The linker merges all sections of the same type into a new aggregate section of the same type. For example, the .data sections from the input modules are all merged into one section that will become the .data section for the output executable object file. The linker then assigns run-time memory addresses to the new aggregate sections, to each section defined by the input modules, and to each symbol defined by the input modules. When this step is complete, each instruction and global variable in the program has a unique run-time memory address.
- Step 2. Relocating symbol references within sections: The linker modifies every symbol reference in the bodies of the code and data sections so that they point to the correct run-time addresses. To perform this step, the linker relies on data structures in the relocatable object modules known as relocation entries.

Relocation entries

When an assembler generates an object module, it does not know where the code and data will ultimately be stored in memory.
Whenever the assembler encounters a reference to an object whose ultimate location is unknown, it generates a relocation entry that tells the linker how to modify the reference when it merges the object file into an executable.
Relocation entries for code are placed in .rel.text.
Relocation entries for data are placed in .rel.data.
Figure 7.9 shows the format of an ELF relocation entry:
- offset: section offset of the reference that will need to be modified
- symbol: identifies the symbol that modified reference should point to
- addend: a signed constant that is used by some types of relocations to bias the value of the modified reference
- type: ELF defines 32 different relocation types; here we are concerned with only the two most basic ones:
  - R_X86_64_PC32: relocate a reference that uses a 32-bit PC-relative address (recall from the Machine-level Representation Control section).
  - R_X86_64_32: relocate a reference that uses a 32-bit absolute address (recall from the Machine-level Representation Control section).
  - These two types support the x86-64 small code model, which assumes that the total size of the code and data is smaller than 2 GB.

Relocating symbol references

Linker’s relocation algorithm
- Assume that each section s is an array of bytes and that each relocation entry r is a struct of type Elf64_Rela (defined in Figure 7.9).
- Assume that when the algorithm runs, the linker has already chosen run-time addresses for each section (e.g., .data, .text, …), denoted ADDR(s), and for each symbol, denoted ADDR(r.symbol).
Example: consider the C program from figure 7.1
- The disassembled code of main.o can be displayed with the objdump tool:
```
objdump -dx main.o
```
Instructions generated:
The main function references two global symbols, sum and array. For each reference, the assembler has generated a relocation entry (stored in .rel.text or .rel.data).

Relocation entries and instructions are actually stored in different sections of the object file. The objdump tool displays them together for convenience.
The relocation entries tell the linker that the reference to sum should be relocated using a 32-bit PC-relative address, and the reference to array should be relocated using a 32-bit absolute address.
- Relocating PC-relative references (at line 6):
  - The corresponding relocation entry r consists of 4 fields:
    r.offset = 0xf, r.symbol = sum, r.type = R_X86_64_PC32, r.addend = -4
  - These fields tell the linker to modify the 32-bit PC-relative reference starting at offset 0xf so that it will point to the sum routine at run-time.
  - Suppose that the linker has determined that:
    ADDR(s) = ADDR(.text) = 0x4004d0, ADDR(r.symbol) = ADDR(sum) = 0x4004e8
  - 1. The linker first computes the run-time address of the reference (line 7 in Figure 7.10):
    refaddr = ADDR(s) + r.offset = 0x4004d0 + 0xf = 0x4004df
  - 2. The linker then updates the reference so that it will point to the sum routine at run time (line 8):
    *refptr = (unsigned) (ADDR(r.symbol) + r.addend - refaddr) = (unsigned) (0x4004e8 + (-4) - 0x4004df) = (unsigned) (0x5)
  - In the resulting executable object file, the call instruction has the following relocated form:
    4004de: e8 05 00 00 00 callq 4004e8 <sum> // sum()
  - At run time, the call instruction will be located at address 0x4004de. When the CPU executes the call instruction, the PC has a value of 0x4004e3 (0x4004de + 5, the length in bytes of e8 05 00 00 00), which is the address of the instruction immediately following the call (recall from section 3.3).
  - So the address of the next instruction to execute is 0x4004e3 + 0x5 = 0x4004e8, where 0x5 is the 32-bit little-endian displacement 05 00 00 00 that follows the e8 opcode. This is the address of sum, which is what we want.
- Relocating absolute references (the reference to array):
  - The corresponding relocation entry r consists of 4 fields:
    r.offset = 0xa, r.symbol = array, r.type = R_X86_64_32, r.addend = 0
  - These fields tell the linker to modify the absolute reference starting at offset 0xa so that it will point to the first byte of array at run time.
  - Suppose that the linker has determined that:
    ADDR(r.symbol) = ADDR(array) = 0x601018
  - The linker updates the reference:
    *refptr = (unsigned) (ADDR(r.symbol) + r.addend) = (unsigned) (0x601018 + 0) = (unsigned) (0x601018)
  - In the resulting executable object file, the reference has the following relocated form:
    4004d9: bf 18 10 60 00 mov $0x601018, %edi // %edi = &array
- Putting it all together, Figure 7.12 shows the relocated .text and .data sections in the final executable object file.