13.2.1 Mark-and-Sweep Garbage Collection

Next: 13.2.2 Copying Garbage Collection Up: 13.2 Automatic Garbage Collection Previous: 13.2 Automatic Garbage Collection Contents

13.2.1 Mark-and-Sweep Garbage Collection

Here too we use the conservative approximation that if we can reach an object by following pointers from program variables, then the object is live (not garbage). These program variables are called roots and can be either frame-allocated or static. Therefore, the garbage collector needs to check the entire static section as well as all frames in the run-time stack for pointers to heap. It is common to use the following conservative approach: if a word has a value between the minimum and maximum address of the heap, then it is considered to be a pointer to the heap. An object is live if it is pointed by either a root or by a live object (this is a recursive definition). The garbage collector needs to start from each root and follow pointers recursively to mark all live objects.

A Mark-and-Sweep garbage collector has two phases:

Mark: starting from roots, mark all reachable objects by using a depth-first-search pointer traversal
Sweep: scan the heap from the beginning to the end and reclaim the unmarked objects (and unmark the marked objects).

The depth-first-search is done using the function

DFS ( p ) =
{ if *p record is unmarked
     then mark *p
  for each pointer p->fi of the record *p do DFS(p->fi)
}

which is called from every root:

for each p in roots
   DFS(p)

The mark phase is:

p = 'first object in the heap'
while p is in the heap do
{  if *p is marked
      then unmark *p
   else insert *p into the free list
   p = p+(size of record *p)
}

For example, consider the following objects in memory at the time the garbage collector takes over:

The roots can be static or frame-allocated variables. Let's assume that these objects are represented as follows in memory (as consecutive bytes):

1 6 e 3

2 0 i 4

3 0 d 5

4 0 g 0

5 0 a 0

6 8 b 7

7 10 k 0

8 0 c 0

9 0 j 10

10 0 f 0

11 0 h 12

12 11 l 0

Here, 0 are null pointers while the other pointers are memory addresses. That is, the free list is 0 (empty) and the roots are 1, 6, and 9.

After the mark-and-sweep garbage collection, the memory layout becomes:

1 6 e 3

2 0

3 0 d 5

4 2

5 0 a 0

6 8 b 7

7 10 k 0

8 0 c 0

9 0 j 10

10 0 f 0

11 4

12 11

where the free list points to the cell 12 now.

Next: 13.2.2 Copying Garbage Collection Up: 13.2 Automatic Garbage Collection Previous: 13.2 Automatic Garbage Collection Contents

2015-01-20