As a running example for this and the subsequent sections, we will construct a simple interpreter for a calculator. It is very important that you download the code yourself on gamma and play with it:
gzip -d -c calc.tar.gz | tar -xvf -
cd calc; make
./calc
If you build calc with your own copy of the gc garbage collector, first unpack it:
gzip -d -c gc.tar.gz | tar -xvf -
Then cd gc and change the first lines of the Makefile to
CC=gcc
CXX=g++
CFLAGS= -O -DNO_SIGNALS -DALL_INTERIOR_POINTERS -DSILENT -DATOMIC_UNCOLLECTABLE -DREDIRECT_MALLOC=GC_malloc_uncollectable
and execute:
make; make c++
to create the C and C++ garbage collectors. Finally, change the line
GCDIR = /public/cse/5317-501/gc/
in calc/Makefile to point to your own gc directory
and then proceed with the installation of calc.
All the files of the calculator example are given in the calc
directory. They will be explained in detail later. For now, we are
ready to look at the scanner, given in calc.lex. The uppercase
constants, such as INT, are defined as tokens in the parser and will
be explained later. For now, just assume that these are distinct
integer constants. The order of the rules is important in some cases:
if we put the rule for {ID} before the keywords, then the keywords
will never be recognized (this is a consequence of the rule priority
law: when two rules match the same longest lexeme, the rule listed
first wins). The actions for lexical constructs that need to be
skipped, such as comments and whitespace, should never return (we
must return only when we find a complete token).
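
The two points above (keywords take priority over identifiers, and skipped constructs never return) can be illustrated with a small hand-written scanner in C. This is only a sketch of the behaviour, not the code that lex generates from calc.lex. The token codes, the single keyword "if", and the "//" comment syntax are assumptions made for the illustration and are not taken from the calculator source.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; in the real project they come from the parser. */
enum token { END = 0, IF = 1, ID = 2, INT = 3, ERROR = 4 };

/* Return the next token in *src and advance *src past it. */
enum token next_token(const char **src)
{
    assert(src != NULL && *src != NULL);
    const char *p = *src;

    /* Skipped constructs: keep looping instead of returning a token. */
    for (;;) {
        if (isspace((unsigned char)*p)) {
            p++;
        } else if (p[0] == '/' && p[1] == '/') {   /* assumed comment syntax */
            while (*p != '\0' && *p != '\n')
                p++;
        } else {
            break;
        }
    }

    if (*p == '\0') {
        *src = p;
        return END;
    }

    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        *src = p;
        return INT;
    }

    if (isalpha((unsigned char)*p)) {
        const char *start = p;
        while (isalnum((unsigned char)*p))
            p++;                        /* take the longest possible lexeme */
        size_t len = (size_t)(p - start);
        *src = p;
        /* The keyword test comes first, like a keyword rule placed before
           {ID}; reversing the order would turn every "if" into an ID. */
        if (len == 2 && strncmp(start, "if", 2) == 0)
            return IF;
        return ID;
    }

    *src = p + 1;                       /* unknown character: consume it */
    return ERROR;
}
```

Calling next_token repeatedly on the string "if  x1 // a comment\n 42 iffy" yields IF, ID, INT, ID and then END. The whitespace and the comment produce no tokens of their own, and "iffy" is an ID rather than IF followed by something else, because the longest lexeme is taken before the keyword test is applied.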