Here too we use the conservative approximation that if we can reach an object by following pointers from program variables, then the object is live (not garbage). These program variables are called roots and can be either frame-allocated or static. Therefore, the garbage collector needs to check the entire static section as well as all frames in the run-time stack for pointers to heap. It is common to use the following conservative approach: if a word has a value between the minimum and maximum address of the heap, then it is considered to be a pointer to the heap. An object is live if it is pointed by either a root or by a live object (this is a recursive definition). The garbage collector needs to start from each root and follow pointers recursively to mark all live objects.
A Mark-and-Sweep garbage collector has two phases:
The depth-first-search is done using the function
DFS ( p ) = { if *p record is unmarked then mark *p for each pointer p->fi of the record *p do DFS(p->fi) }
which is called from every root:
for each p in roots DFS(p)
The mark phase is:
p = 'first object in the heap' while p is in the heap do { if *p is marked then unmark *p else insert *p into the free list p = p+(size of record *p) }
For example, consider the following objects in memory at the time the garbage collector takes over:
The roots can be static or frame-allocated variables. Let's assume that these objects are represented as follows in memory (as consecutive bytes):
1 | 6 | e | 3 |
2 | 0 | i | 4 |
3 | 0 | d | 5 |
4 | 0 | g | 0 |
5 | 0 | a | 0 |
6 | 8 | b | 7 |
7 | 10 | k | 0 |
8 | 0 | c | 0 |
9 | 0 | j | 10 |
10 | 0 | f | 0 |
11 | 0 | h | 12 |
12 | 11 | l | 0 |
After the mark-and-sweep garbage collection, the memory layout becomes:
1 | 6 | e | 3 |
2 | 0 | ||
3 | 0 | d | 5 |
4 | 2 | ||
5 | 0 | a | 0 |
6 | 8 | b | 7 |
7 | 10 | k | 0 |
8 | 0 | c | 0 |
9 | 0 | j | 10 |
10 | 0 | f | 0 |
11 | 4 | ||
12 | 11 |