next up previous contents
Next: 8.2 Control Statements Up: 8 Intermediate Code Previous: 8 Intermediate Code   Contents

8.1 Translating Variables, Records, Arrays, and Strings

Local variables that live on the stack are retrieved using the IR expression MEM(+(TEMP(fp),CONST(offset))). If a variable is located in an outer static scope, k levels out from the current scope, then we retrieve the variable using the IR:

MEM(+(MEM(+(...MEM(+(MEM(+(TEMP(fp),CONST(static))),CONST(static))),...)),
      CONST(offset)))
where static is the offset of the static link. That is, we follow the static chain k times, and then we retrieve the variable using the offset of the variable.
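
The chain-following rule above can be sketched as a small IR builder. This is only an illustration: the helper names (mem, plus, var_access) and the use of plain strings for IR trees are assumptions made here, not part of the notes.

```python
def mem(expr):
    """Wrap an address expression in a MEM node."""
    return f"MEM({expr})"


def plus(left, right):
    """Build a binary + node."""
    return f"+({left},{right})"


def var_access(k, offset, static="static"):
    """IR for a variable that lives k static levels out from the current frame.

    Each loop iteration follows the static link once; the final MEM
    reads the variable at its offset in the frame that was reached.
    """
    frame = "TEMP(fp)"
    for _ in range(k):
        frame = mem(plus(frame, f"CONST({static})"))
    return mem(plus(frame, f"CONST({offset})"))


local_ir = var_access(0, "offset")   # variable in the current frame
outer_ir = var_access(2, "offset")   # variable two scopes out
print(local_ir)
print(outer_ir)
```

With k = 0 the builder produces exactly the local-variable form, and each extra level adds one more MEM around the frame address.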

An l-value is the result of an expression that can occur on the left of an assignment statement. For example, x[f(a,6)].y is an l-value. It denotes a location where we can store a value. It is constructed by deriving the IR of the value and then dropping the outermost MEM node. For example, if the value is MEM(+(TEMP(fp),CONST(offset))), then the l-value is +(TEMP(fp),CONST(offset)).
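
A minimal sketch of this rule, assuming IR trees are encoded as nested Python tuples whose first element is the node name (a hypothetical encoding chosen only for this example):

```python
def lvalue(ir):
    """Return the address expression of a value by dropping its outermost MEM.

    An expression whose outermost node is not MEM does not name a
    memory location, so it is rejected.
    """
    if ir[0] != "MEM":
        raise ValueError("not an l-value: outermost node is not MEM")
    return ir[1]


# Value of a local variable at offset 16 in the current frame.
value = ("MEM", ("+", ("TEMP", "fp"), ("CONST", 16)))
address = lvalue(value)
print(address)
```

A constant such as ("CONST", 3) is rejected, which matches the intuition that 3 := x is not a legal assignment.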

In Tiger (the language described in the textbook), arrays start from index 0 and each array element is 4 bytes long (one word), which may hold an integer or a pointer to some value. To retrieve the ith element of an array a, we use MEM(+(A,*(I,4))), where A is the address of a (e.g., A is +(TEMP(fp),CONST(34))) and I is the value of i (e.g., MEM(+(TEMP(fp),CONST(26)))). But this is not sufficient. The IR should also check that i < size(a): CJUMP(lt,I,CONST(size_of_A),MEM(+(A,*(I,4))),NAME(error_label)). That is, if i is within bounds we access the element; otherwise we jump to error_label and raise an exception.
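
The address arithmetic and the bounds check can be tried out on a simulated memory. The sketch below assumes a little-endian byte array as memory and a hypothetical helper load_element; it also checks the lower bound 0 <= i, which the text above does not mention.

```python
import struct

WORD = 4  # every array element is one 4-byte word


def load_element(memory, base, i, size):
    """Read element i of the array that starts at byte address base.

    Mirrors MEM(+(A,*(I,4))) guarded by a bounds check. The lower-bound
    test is an addition made for this sketch.
    """
    if not (0 <= i < size):
        raise IndexError(f"index {i} out of bounds for array of size {size}")
    address = base + i * WORD
    return struct.unpack_from("<i", memory, address)[0]


# Place a 3-element array at byte address 8 of a 64-byte memory.
memory = bytearray(64)
base = 8
for index, word in enumerate([10, 20, 30]):
    struct.pack_into("<i", memory, base + index * WORD, word)

third = load_element(memory, base, 2, 3)
print(third)
```

Asking for element 3 of this array raises IndexError instead of silently reading the word that follows the array.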

For records, we need to know the byte offset of each field (record attribute) from the base of the record. Since every value is 4 bytes long, the ith field (counting from 0) of a record a can be retrieved using MEM(+(A,CONST(i*4))), where A is the address of a. Here i is always a constant since we know the field name. For example, suppose that the variable i is located in the local frame with offset 24 and the array a is located in the immediate outer scope and has offset 40. Then the statement a[i+1].first := a[i].second+2 is translated into the IR:

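MOVE(MEM(+(MEM(+(+(MEM(+(TEMP(fp),CONST(static))),CONST(40)),
                 *(+(MEM(+(TEMP(fp),CONST(24))),
                     CONST(1)),
                   CONST(4)))),
           CONST(0))),
     +(MEM(+(MEM(+(+(MEM(+(TEMP(fp),CONST(static))),CONST(40)),
                   *(MEM(+(TEMP(fp),CONST(24))),
                     CONST(4)))),
             CONST(4))),
       CONST(2)))
since the offset of first is 0 and the offset of second is 4. Here MEM(+(TEMP(fp),CONST(static))) follows the static link once to reach the frame that holds a, and the MEM around each indexed element loads the pointer to the record stored in that array slot, because a one-word array element cannot hold a two-field record itself.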
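
The effect of the assignment a[i+1].first := a[i].second+2 can be checked on a toy memory. The sketch assumes that each array element holds a pointer to a two-field record, that memory is a dictionary from byte addresses to words, and that all addresses (100, 200, 208) are arbitrary values chosen for illustration.

```python
WORD = 4
FIELD_OFFSET = {"first": 0, "second": 4}  # byte offsets given in the text

memory = {}  # hypothetical store: byte address -> word value


def load(address):
    """MEM(address) used as a value."""
    return memory[address]


def store(address, value):
    """MOVE(MEM(address), value)."""
    memory[address] = value


def field_address(array_base, i, field):
    """Address of a[i].field when a[i] holds a pointer to the record."""
    record = load(array_base + i * WORD)   # MEM(+(A,*(I,4)))
    return record + FIELD_OFFSET[field]    # +(record,CONST(offset))


# Array a at address 100 with two elements pointing to records at 200 and 208.
A = 100
store(A + 0, 200)
store(A + 4, 208)
store(200, 1)   # a[0].first
store(204, 5)   # a[0].second
store(208, 0)   # a[1].first
store(212, 0)   # a[1].second

i = 0
# a[i+1].first := a[i].second + 2
store(field_address(A, i + 1, "first"), load(field_address(A, i, "second")) + 2)
print(memory[208])
```

Only a[1].first changes; the source record and the other field of the target record are left as they were.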

In Tiger, strings of size n are allocated in the heap in n + 4 consecutive bytes, where the first 4 bytes contain the size of the string. The string is simply a pointer to the first byte. String literals are statically allocated. For example, the MIPS code:

ls:     .word        14
        .ascii       "example string"
binds the label ls to a string of 14 bytes. Other languages, such as C, store a string of size n in n + 1 bytes of storage, where the last byte is null to mark the end of the string. When such a string is allocated dynamically, you can allocate a string of size n at address A in the heap by saving the current global pointer ($gp in MIPS) in A and then adding n + 1 to the pointer, so that A refers to the start of the new block:
SEQ(MOVE(A,TEMP(gp)),MOVE(TEMP(gp),+(TEMP(gp),CONST(n+1))))
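
The two string layouts described above can be compared byte by byte. This is a sketch under stated assumptions: the function names tiger_string and c_string are hypothetical, the text is ASCII, and the length word is written little-endian (the notes do not fix a byte order).

```python
import struct


def tiger_string(text):
    """Length-prefixed layout: a 4-byte size word followed by the n bytes."""
    data = text.encode("ascii")
    return struct.pack("<i", len(data)) + data


def c_string(text):
    """Null-terminated layout: the n bytes followed by one zero byte."""
    return text.encode("ascii") + b"\x00"


sample = "example string"
prefixed = tiger_string(sample)
terminated = c_string(sample)

# Size stored in the first word of the length-prefixed layout.
stored_size = struct.unpack_from("<i", prefixed, 0)[0]
print(len(prefixed), len(terminated), stored_size)
```

The length-prefixed form gives the size in constant time and allows zero bytes inside the string, while the null-terminated form needs a scan to find the end.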


2015-01-20