Next: 8.1 Translating Variables, Records,
Up: CSE 5317/4305: Design and
Previous: 7.2 Case Study: Activation
The semantic phase of a compiler first translates parse trees into an
intermediate representation (IR), which is independent of the
underlying computer architecture, and then generates machine code from
the IRs. This makes the task of retargeting the compiler to another
computer architecture easier to handle.
We will consider Tiger for a case study. For simplicity, all data are
one word long (four bytes) in Tiger. For example, strings, arrays,
and records, are stored in the heap and they are represented by one
pointer (one word) that points to their heap location. We also assume
that there is an infinite number of temporary registers (we will see
later how to spill-out some temporary registers into the stack
frames). The IR trees for Tiger are used for building expressions and
statements. They are described in detail on pages 152 and 153 in the
textbook. Here is a brief description of the meaning of the expression IRs:
- CONST(i): the integer constant i.
- MEM(e): if e is an expression that calculates a memory address, then this is the contents
of the memory at address e (one word).
- NAME(n): the address that corresponds to the label n, eg. MEM(NAME(x)) returns the value stored at the location X.
- TEMP(t): if t is a temporary register, return the value of the register, eg. MEM(BINOP(PLUS,TEMP(fp),CONST(24))),
fetches a word from the stack located 24 bytes above the frame pointer.
- BINOP(op,e1,e2): evaluate e1, then evaluate e2, and finally perform the binary operation op over
the results of the evaluations of e1 and e2. op can be PLUS, AND, etc.
We abbreviate BINOP(PLUS,e1,e2) as +(e1,e2).
- CALL(f,[e1,e2,...,en]): evaluate the expressions e1, e2, etc (in that order), and at the
end call the function f over these n parameters. eg. CALL(NAME(g),ExpList(MEM(NAME(a)),ExpList(CONST(1),NULL)))
represents the function call g(a,1).
- ESEQ(s,e): execute statement s and then evaluate and return the value of the expression e
And here is the description of the statement IRs:
- MOVE(TEMP(t),e): store the value of the expression e into the register t.
- MOVE(MEM(e1),e2): evaluate e1 to get an address, then evaluate e2, and then store the value of e2
in the address calculated from e1. eg, MOVE(MEM(+(NAME(x),CONST(16))),CONST(1)) computes x[4] := 1
(since 4*4bytes = 16 bytes).
- EXP(e): evaluates e and discards the result.
- JUMP(L): Jump to the address L (which must be defined in the program by some LABEL(L)).
- CJUMP(o,e1,e2,t,f): evaluate e1, then e2. The binary relational operator o must be EQ, NE, LT etc.
If the values of e1 and e2 are related by o, then jump to the address calculated by t, else jump the one for f.
- SEQ(s1,s2,...,sn): perform statement s1, s2, ... sn is sequence.
- LABEL(n): define the name n to be the address of this statement. You can retrieve this address using NAME(n).
In addition, when JUMP and CJUMP IRs are first constructed, the labels
are not known in advance but they will be known later when they are
defined. So the JUMP and CJUMP labels are first set to NULL and then
later, when the labels are defined, the NULL values are changed to
real addresses.
Subsections
Next: 8.1 Translating Variables, Records,
Up: CSE 5317/4305: Design and
Previous: 7.2 Case Study: Activation
Leonidas Fegaras
2002-08-26