A Tour of Computer Systems

Where do we begin?

We begin our journey by tracing the lifetime of the hello program, from the time a programmer creates it until it runs on a system, prints its message, and terminates.

hello.c
#include <stdio.h>

int main(void) {
    printf("Hello World\n");
    return 0;
}

Compile the program and run it:

Terminal window
gcc -o hello hello.c
./hello
Hello World

Information is Bits and Context

Our hello program begins as a source program (or source file).
The source program is a sequence of bits, organized in 8-bit chunks called bytes.
Each byte represents one text character of the program. Most systems use the ASCII standard, which represents each character with a unique byte-size integer value.

The ASCII text representation of hello.c.

Programs are translated into different forms

To run hello.c on the system, its C statements must be translated into a sequence of low-level machine-language instructions. These instructions are then packaged in a form called an executable object program (also called a binary) and stored as a binary file on disk.


  • Preprocessing phase: the #include <stdio.h> directive tells the preprocessor to read the contents of the system header file stdio.h and insert it directly into the program text. The result is another C program, typically with the .i suffix.
Terminal window
gcc -E hello.c -o hello.i
  • Compilation phase: the compiler translates the text file hello.i into the text file hello.s, which contains an assembly-language program. This program includes the following definition of function main:
Terminal window
gcc -S hello.i -o hello.s
hello.s
main:
    subq $8, %rsp
    movl $.LC0, %edi
    call puts
    movl $0, %eax
    addq $8, %rsp
    ret
  • Assembly phase: the assembler translates hello.s into machine-language instructions, packages them in a form known as a relocatable object program, and stores the result in the object file hello.o. This is a binary file that contains the encoded instructions for function main.
Terminal window
as -o hello.o hello.s
  • Linking phase: notice that our hello program calls the printf function, which is part of the standard C library. The printf function resides in a separate precompiled object file called printf.o, which must somehow be merged with our hello.o program. The linker (ld) handles this merging. The result is the hello file, an executable object file that is ready to be loaded into memory and executed by the system.
Terminal window
# gcc invokes the linker (ld), which links the standard C library by default
gcc -o hello hello.o

Processors read and interpret instructions stored in memory

At this point, our hello.c source program has been translated into an executable object file called hello that is stored on disk.
To run the executable on a Unix system, type its path at the prompt of a shell (such as bash or zsh):

Terminal window
./hello
Hello World

The shell loads and runs the hello program, then waits for it to terminate.

To understand what happens to our hello program when we run it, we need to understand the hardware organization of a typical system, which is shown below.

Hardware organization.

  • Buses (system bus, memory bus, I/O bus): running throughout the system is a collection of electrical conduits called buses that carry bytes of information back and forth between the components.
    • Buses are typically designed to transfer fixed-size chunks of bytes known as words.
    • The number of bytes in a word (the word size) depends on the processor. Most machines today have a word size of either 4 bytes (32 bits) or 8 bytes (64 bits), which is what a term like "64-bit CPU" refers to.
  • I/O devices (mouse, keyboard, display, and disk): the system’s connection to the external world.
    • Each I/O device is connected to the I/O bus by either a controller or an adapter. A controller is a chipset in the device itself or on the mainboard; an adapter is a card that plugs into a slot on the mainboard.
  • Main Memory: a temporary storage device that holds both a program and the data it manipulates while the processor is executing the program.
    • Physically, main memory consists of a collection of dynamic random access memory (DRAM) chips.
    • Logically, memory is organized as a linear array of bytes, each with its own unique address starting at zero.
  • Processor: the central processing unit (CPU) is the engine that interprets (or executes) the instructions stored in main memory. It consists of:
    • Program counter (PC): a word-size register (usually 32 or 64 bits) that, at any point in time, contains the main-memory address of the machine-language instruction being executed.
      • From the time that power is applied to the system until the time that the power is shut off, a processor repeatedly executes the instruction pointed at by the program counter and updates the program counter to point to the next instruction.
    • Register file: a small storage device that consists of a collection of word-size registers, each with its own unique name (for example, the stack pointer and the frame pointer).
    • Arithmetic/Logic Unit (ALU): computes new data and address values.

Given this simplified view of a system’s hardware organization, we can begin to understand what happens when we run our program:

  • As we type the characters ./hello at the keyboard, the shell program reads each one into a register and then stores it in memory (figure: Reading the hello command from the keyboard).
  • When we hit the enter key on the keyboard:
    • The shell uses the fork system call to ask the OS to create a new process.

    • The OS saves the shell process’s context.

    • The OS loads the hello program and static data into the virtual memory space of the newly created process. (Paging and Swapping)

    • The OS creates a Process Control Block (a data structure) entry and adds it to a process list to keep track of the processes.

    • The OS must do some work (TODO) before running the process

      • Allocate memory (heap, stack, …) for the process
      • I/O related tasks
    • By jumping to main() through a specialized mechanism, the OS transfers control of the CPU to the newly created process, and the program begins executing (figure: Loading executable from disk).

    • The processor begins executing the machine-language instructions in the hello program’s main routine. These instructions copy the bytes of the Hello World\n string from memory to the register file, and from there to the display device.

    • The OS then restores the context of the shell process and passes control back to it, where it waits for the next command line input.

TODO: memory caching (caches)

TODO: network I/O