next up previous contents
Next: 4.3 Gen: A Java Up: 4 Abstract Syntax Previous: 4.1 Building Abstract Syntax   Contents

4.2 Building Abstract Syntax Trees in C

The previous example of expression ASTs can be written as follows in C:

typedef struct Exp {
  enum { int_exp, true_exp, false_exp, variable_exp,
         binary_op, unary_op, function_call,
         record_construction, projection } tag;
  union { int                                      integer;
          string                                   variable;
          struct { string           oper;
                   struct Exp*      left;
                   struct Exp*      right; }       binary;
          struct { string           oper;
                   struct Exp*      uexp; }        unary;
          struct { string           name;
                   struct Exp_list* arguments; }   call;
          struct rec { string       attribute;
                       struct Exp*  value;
                       struct rec*  next; }        record;
          struct { struct Exp*  value;
                   string attribute; }             project;
      } op;
} ast;
where Exp_list is a list of ASTs:
typedef struct Exp_list { 
  ast*             elem;
  struct Exp_list* next;
} ast_list;
It's a good idea to define a constructor for every kind of expression to simplify the task of constructing ASTs:
ast* make_binary_op ( string oper, ast* left, ast* right ) {
  ast* e = (ast*) malloc(sizeof(ast));
  e->tag = binary_op;
  e->op.binary.oper = make_string(oper);
  e->op.binary.left = left;
  e->op.binary.right = right;
  return e;
};
For example,
make_binary_op("+",make_binary_op("-",make_variable("x"),make_integer(2)),
               make_integer(3))
constructs the AST for the input (x-2)+3.

Unfortunately, when constructing a compiler, we need to define many tree-like data structures to capture ASTs for many different constructs, such as expressions, statements, declarations, programs etc, as well as type structures, intermediate representation (IR) trees, etc. This would require hundreds of recursive structs in C or classes in Java. An alternative method is to use just one generic tree structure to capture all possible tree structures. This is exactly what we did for our calculator example and is described in detail below.



fegaras 2012-01-10