An executable program generated by a compiler will have the following organization in memory on a typical architecture (such as on MIPS):
This is the layout in memory of an executable program. Note that in a virtual memory architecture (which is the case for any modern operating system), some parts of the memory layout may in fact be located on disk blocks and they are retrieved in memory by demand (lazily).
The machine code of the program is typically located at the
lowest part of the layout. Then, after the code, there is a section
to keep all the fixed size static data in the program. The dynamically
allocated data (ie. the data created using malloc
in C) as well
as the static data without a fixed size (such as arrays of variable
size) are created and kept in the heap. The heap grows from low to
high addresses. When you call malloc
in C to create a
dynamically allocated structure, the program tries to find an empty
place in the heap with sufficient space to insert the new data; if it
can't do that, it puts the data at the end of the heap and increases
the heap size.
The focus of this section is the stack in the memory layout. It is called the run-time stack. The stack, in contrast to the heap, grows in the opposite direction (upside-down): from high to low addresses, which is a bit counterintuitive. The stack is not only used to push the return address when a function is called, but it is also used for allocating some of the local variables of a function during the function call, as well as for some bookkeeping.
Lets consider the lifetime of a function call. When you call a function you not only want to access its parameters, but you may also want to access the variables local to the function. Worse, in a nested scoped system where nested function definitions are allowed, you may want to access the local variables of an enclosing function. In addition, when a function calls another function, we must forget about the variables of the caller function and work with the variables of the callee function and when we return from the callee, we want to switch back to the caller variables. That is, function calls behave in a stack-like manner. Consider for example the following program:
procedure P ( c: integer ) x: integer; procedure Q ( a, b: integer ) i, j: integer; begin x := x+a+j; end; begin Q(x,c); end;
At run-time, P
will execute the statement x := x+a+j
in
Q
. The variable a
in this statement comes as a
parameter to Q
, while j
is a local variable in Q
and x
is a local variable to P
. How do we organize the
runtime layout so that we will be able to access all these variables
at run-time? The answer is to use the run-time stack.
When we call a function f
, we push a new activation
record (also called a frame) on the run-time stack, which is
particular to the function f
. Each frame can occupy many
consecutive bytes in the stack and may not be of a fixed
size. When the callee function returns to the caller, the activation
record of the callee is popped out. For example, if the main program
calls function P
, P
calls E
, and E
calls
P
, then at the time of the second call to P
, there will
be 4 frames in the stack in the following order: main
,
P
, E
, P
Note that a callee should not make any assumptions about who is the
caller because there may be many different functions that call the
same function. The frame layout in the stack should reflect this.
When we execute a function, its frame is located on top of the stack.
The function does not have any knowledge about what the previous
frames contain. There two things that we need to do though when
executing a function: the first is that we should be able to pop-out
the current frame of the callee and restore the caller frame. This can
be done with the help of a pointer in the current frame, called the
dynamic link, that points to the previous frame (the caller's
frame). Thus all frames are linked together in the stack using
dynamic links. This is called the dynamic chain. When we pop
the callee from the stack, we simply set the stack pointer to be the
value of the dynamic link of the current frame. The second thing that
we need to do, is if we allow nested functions, we need to be able to
access the variables stored in previous activation records in the
stack. This is done with the help of the static link. Frames
linked together with static links form a static chain. The
static link of a function f
points to the latest frame in the
stack of the function that statically contains f
.
If f
is not lexically contained in any other function, its
static link is null. For example, in the previous program, if P
called Q
then the static link of Q
will point to the
latest frame of P
in the stack (which happened to be the
previous frame in our example). Note that we may have multiple frames
of P
in the stack; Q
will point to the latest. Also
notice that there is no way to call Q
if there is no P
frame in the stack, since Q
is hidden outside P
in the
program.
A typical organization of a frame is the following:
Before we create the frame for the callee, we need to allocate space
for the callee's arguments. These arguments belong to the caller's
frame, not to the callee's frame. There is a frame pointer (called
FP
) that points to the beginning of the frame. The stack
pointer points to the first available byte in the stack immediately
after the end of the current frame (the most recent frame). There are
many variations to this theme. Some compilers use displays to
store the static chain instead of using static links in the stack. A
display is a register block where each pointer points to a consecutive
frame in the static chain of the current frame. This allows a very
fast access to the variables of deeply nested functions. Another
important variation is to pass arguments to functions using registers.
When a function (the caller) calls another function (the callee), it executes the following code before (the pre-call) and after (the post-call) the function call:
We can classify the variables in a program into four categories:
Every frame-resident variable (ie. a local variable) can be
viewed as a pair of (level,offset). The variable level indicates the
lexical level in which this variable is defined. For example, the
variables inside the top-level functions (which is the case for all
functions in C) have level=1. If a function is nested inside a
top-level function, then its variables have level=2, etc. The offset
indicates the relative distance from the beginning of the frame that
we can find the value of a variable local to this frame. For example,
to access the nth argument of the frame in the above figure, we
retrieve the stack value at the address FP+1
, and to access the
first argument, we retrieve the stack value at the address FP+n
(assuming that each argument occupies one word). When we want to
access a variable whose level is different from the level of the
current frame (ie. a non-local variable), we subtract the level of the
variable from the current level to find out how many times we need to
follow the static link (ie. how deep we need to go in the static
chain) to find the frame for this variable. Then, after we locate the
frame of this variable, we use its offset to retrieve its value from
the stack. For example, to retrieve to value of x
in the
Q
frame, we follow the static link once (since the level of
x
is 1 and the level of Q
is 2) and we retrieve x
from the frame of P
.
Another thing to consider is what exactly do we pass as arguments to the callee. There are two common ways of passing arguments: by value and by reference. When we pass an argument by reference, we actually pass the address of this argument. This address may be an address in the stack, if the argument is a frame-resident variable. A third type of passing arguments, which is not used anymore but it was used in Algol, is call by name. In that case we create a thunk inside the caller's code that behaves like a function and contains the code that evaluates the argument passed by name. Every time the callee wants to access the call-by-name argument, it calls the thunk to compute the value for the argument.