As a running example for this and the subsequent sections, we will use a simple interpreter for a calculator written in Java. If you don't have Java on your PC, you need to download Sun's J2SE SDK from http://java.sun.com/j2se/downloads.html.
On a Unix/MacOS system, run the following commands:
tar xfz calc.tgz to extract the files,
cd calc; ./build to enter the calc directory and compile the programs,
./run to run the calculator.
Using the calculator, you can evaluate expressions, such as
2*(3+8);, you can assign values to variables, such as
x:=3+4;, you can reference variables by name, such as
x+3;, you can define recursive functions interactively, such as
define f(n) = if n=0 then 1 else n*f(n-1);, and you can call them,
such as f(5);. To exit, type quit;.
The source files of the calculator example are available at
http://lambda.uta.edu/cse5317/calc/.
Some of these files will be explained
in detail later; for now, we look only at the scanner, given
in calc.lex. The tokens, such as sym.INT, are imported from the parser
and will be explained in the next section. For now, just assume that they
are distinct integer constants. The order of the rules is important in some
cases: if we put the rule for {ID} before the keywords, then the keywords
will never be recognized, because a keyword also matches {ID} and, by the
rule priority law, the earlier rule wins when two rules match the same text.
The rules for lexical constructs that need to be skipped, such as comments
and whitespace, should never return (we must return only when we find a
complete token).
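
To make the effect of rule order concrete, here is a minimal hand-written sketch in plain Java. It is not the code generated from calc.lex and it does not use the scanner generator. The class name MiniScanner, the token codes IF, ID, INT, and the SKIP marker are hypothetical stand-ins for the parser's sym constants. The sketch assumes the usual matching policy: take the longest match, and when two rules match text of the same length, prefer the rule listed first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiniScanner {
    // Hypothetical token codes (plain integers, like the sym constants).
    static final int IF = 0, ID = 1, INT = 2;
    // Marker for rules that consume text without producing a token.
    static final int SKIP = -1;

    // One scanner rule: a regular expression and the token code it yields.
    static final class Rule {
        final Pattern pattern;
        final int token;
        Rule(String regex, int token) {
            this.pattern = Pattern.compile(regex);
            this.token = token;
        }
    }

    // Builds the rule list with the keyword rule either before or after ID.
    static List<Rule> rules(boolean keywordsFirst) {
        List<Rule> rs = new ArrayList<>();
        rs.add(new Rule("[ \\t\\n]+", SKIP));           // whitespace: skipped
        if (keywordsFirst) rs.add(new Rule("if", IF));
        rs.add(new Rule("[a-zA-Z][a-zA-Z0-9]*", ID));
        if (!keywordsFirst) rs.add(new Rule("if", IF));
        rs.add(new Rule("[0-9]+", INT));
        return rs;
    }

    // Scans the input and returns the token codes in order.
    static List<Integer> scan(String input, List<Rule> rules) {
        List<Integer> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Rule best = null;
            int bestLen = 0;
            for (Rule r : rules) {
                Matcher m = r.pattern.matcher(input);
                m.region(pos, input.length());
                // Strict '>' keeps the earlier rule when lengths are equal.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    best = r;
                    bestLen = m.end() - pos;
                }
            }
            if (best == null) {
                throw new IllegalArgumentException("illegal character at position " + pos);
            }
            // Skipped constructs advance the position but emit nothing.
            if (best.token != SKIP) tokens.add(best.token);
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("if x1 42", rules(true)));   // [0, 1, 2]
        System.out.println(scan("if x1 42", rules(false)));  // [1, 1, 2]
        System.out.println(scan("iffy", rules(true)));       // [1]
    }
}
```

With the keyword rule first, the input if x1 42 yields IF, ID, INT. With the {ID}-like rule first, the word if is reported as an ordinary identifier, which is the failure described above. The last call shows that priority only breaks ties: iffy is still a single identifier because the longer match wins.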