Andrea Meibos
CS 380
February 16, 1999

From High-Level Code to Program in Memory

Most programs today are first written in a high-level language, such as C++. Getting from that source code to machine instructions running on the CPU takes four tools: a compiler, an assembler, a linker, and finally a loader.

The compiler generates assembly code from the high-level source. Each high-level statement corresponds to one or more assembly instructions. In addition, the compiler decides which variables to keep in registers and which to keep in memory at each point in the program.
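As a rough illustration of this step, the Python sketch below translates one simple statement into one MIPS-style instruction. It is only a toy: the function name compile_statement and the fixed variable-to-register map are assumptions made for this example, and a real compiler chooses registers itself and handles far more than "x = y op z".

```python
# Toy translation of "x = y op z" into a single MIPS-style instruction.
# The register assignment is supplied by hand here; a real compiler
# decides it through register allocation.
OPS = {"+": "add", "-": "sub"}

def compile_statement(stmt, registers):
    """Translate a statement of the form 'x = y + z' or 'x = y - z'."""
    target, expr = [part.strip() for part in stmt.split("=")]
    left, op, right = expr.split()
    return f"{OPS[op]} {registers[target]}, {registers[left]}, {registers[right]}"

# Hypothetical assignment of variables to registers.
registers = {"a": "$t0", "b": "$s1", "c": "$s2"}
print(compile_statement("a = b + c", registers))  # add $t0, $s1, $s2
```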

From this assembly code, the assembler creates an object file that contains the machine language instructions. Each assembly instruction, together with its register operands, is translated into its binary equivalent. For example, on a MIPS machine, add $t0, $s1, $s2 becomes the following fields (shown in decimal):

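op    rs    rt    rd    shamt    funct
0     17    18    8     0        32

Here op and funct together select the add operation, rs and rt are the source registers $s1 (register 17) and $s2 (register 18), rd is the destination register $t0 (register 8), and shamt (the shift amount) is unused.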
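The packing of these fields into one 32-bit word can be sketched in Python. The field widths (6, 5, 5, 5, 5, and 6 bits) and the register numbers ($t0 = 8, $s1 = 17, $s2 = 18) follow the standard MIPS R-type format; the function names are made up for this example.

```python
# Standard MIPS register numbers for the registers used in the example.
REGISTERS = {"$t0": 8, "$s1": 17, "$s2": 18}

def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields (6+5+5+5+5+6 bits) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_add(rd, rs, rt):
    """Encode 'add rd, rs, rt' (op = 0, funct = 32)."""
    return encode_r_type(0, REGISTERS[rs], REGISTERS[rt], REGISTERS[rd], 0, 32)

word = encode_add("$t0", "$s1", "$s2")
print(f"{word:032b}")   # the instruction as 32 binary digits
print(f"{word:#010x}")  # 0x02324020
```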

In addition, the assembler must create a symbol table that contains labels and their corresponding addresses for branches and memory transfer instructions. The object file contains the machine language code, the symbol table, hard-coded data, unresolved references to external code, and other data for debugging or relocation.
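One common way to build such a symbol table is a first pass over the assembly source that only counts addresses. The sketch below assumes that every line holds at most one label and one instruction, that every instruction is 4 bytes (true of real MIPS instructions, but not of pseudo-instructions that expand to several), and that the module starts at address 0. The name build_symbol_table is hypothetical.

```python
def build_symbol_table(lines, base_address=0):
    """First assembler pass: map each label to the address it marks."""
    symbols = {}
    address = base_address
    for line in lines:
        line = line.split("#")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if ":" in line:
            label, line = line.split(":", 1)
            symbols[label.strip()] = address
            line = line.strip()
        if line:
            address += 4  # assume one 4-byte instruction per line
    return symbols

program = [
    "main:  add $t0, $s1, $s2",
    "loop:  beq $t0, $zero, done",
    "       sub $t0, $t0, $s1",
    "       j loop",
    "done:  jr $ra",
]
symbols = build_symbol_table(program)
print(symbols)  # {'main': 0, 'loop': 4, 'done': 16}
```

A second pass could then replace each label used by a branch or jump with the address recorded here.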

The main job of the linker is to combine an object file with the library routines it requires to form the machine language program. Programs use precompiled library routines so that commonly used code doesn't have to be recompiled each time one makes a change in the program code. Both the program object file and the library object file are placed in memory, and then the linker determines the previously unresolved addresses using each module's symbol table. These addresses and all absolute references are patched to reflect the location in memory of the program.
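The linking steps described above can be sketched as follows. The module layout (a dictionary with a size, a symbol table of offsets, and a list of unresolved references), the symbol name print_int, and the base address 0x00400000 are all assumptions for this example, not the format of any real object file.

```python
def link(modules, base_address=0x00400000):
    """Place modules back to back, then resolve cross-module references."""
    # Step 1: assign each module a load address.
    placement = {}
    address = base_address
    for module in modules:
        placement[module["name"]] = address
        address += module["size"]

    # Step 2: turn each module-relative symbol offset into an absolute address.
    global_symbols = {}
    for module in modules:
        for symbol, offset in module["symbols"].items():
            global_symbols[symbol] = placement[module["name"]] + offset

    # Step 3: list every location to patch and the address to patch in.
    patches = []
    for module in modules:
        for offset, symbol in module["references"]:
            if symbol not in global_symbols:
                raise KeyError(f"undefined symbol: {symbol}")
            patches.append((placement[module["name"]] + offset, global_symbols[symbol]))
    return global_symbols, patches

program = {"name": "program", "size": 0x20,
           "symbols": {"main": 0x00}, "references": [(0x08, "print_int")]}
library = {"name": "library", "size": 0x40,
           "symbols": {"print_int": 0x10}, "references": []}

global_symbols, patches = link([program, library])
for location, target in patches:
    print(f"patch {location:#010x} -> {target:#010x}")
```

In this run the call at offset 0x08 of the program is patched to point at print_int, which ends up at 0x00400030 because the library is placed directly after the 0x20-byte program.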

Once the executable file has been created, the program still has to be in the correct spot in memory for the computer to execute it. This is the job of the loader. The loader first reads the header of the executable to determine how much memory to allocate for the program and its associated data. The loader then copies the program and the data into the allocated memory and places any parameters to the main function on the stack.

Next, the loader initializes the registers to zero, except for the stack pointer, which is initialized to the first free location on the stack. Lastly, the loader jumps to a start-up routine. The start-up routine copies the parameters from the stack to the argument registers and then calls the main routine of the program. After the main program finishes, the start-up routine exits with an exit system call.
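The loader's steps can be imitated with a small simulation. Everything about the layout here is an assumption for illustration: memory is a flat byte array, the executable is a dictionary with a header, the stack grows downward from the top of memory, and arguments are 4-byte integers. The one fixed detail is that $sp is register 29 on MIPS.

```python
def load(executable, args, memory_size=64):
    """Simulate a loader: copy code and data, push arguments, set registers."""
    header = executable["header"]
    text, data = executable["text"], executable["data"]

    # Allocate memory, then copy the program and its data into it.
    memory = bytearray(memory_size)
    memory[0:header["text_size"]] = text
    data_start = header["text_size"]
    memory[data_start:data_start + header["data_size"]] = data

    # Push the arguments; sp always names the first free stack word.
    sp = memory_size - 4
    for arg in reversed(args):
        memory[sp:sp + 4] = arg.to_bytes(4, "big")
        sp -= 4

    # Zero every register except the stack pointer ($sp is register 29).
    registers = [0] * 32
    registers[29] = sp

    # Execution would continue at the start-up routine's address.
    pc = header["entry"]
    return memory, registers, pc

executable = {
    "header": {"text_size": 8, "data_size": 4, "entry": 0},
    "text": bytes.fromhex("0232402003e00008"),  # add $t0, $s1, $s2 ; jr $ra
    "data": bytes([0, 0, 0, 42]),
}
memory, registers, pc = load(executable, args=[3, 7])
print(registers[29], pc)  # 52 0
```

The start-up routine itself is not modeled; in the simulation it would begin at pc, copy the two arguments from the stack into the argument registers, and call main.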
