In the fifth part of the compiler project you will extend your existing compiler frontend (which builds the checked intermediate representation for SIMPLE programs in the form of the symbol table and the abstract syntax tree) with a backend that actually runs SIMPLE programs. Your interpreter first traverses the ST to build an environment which tracks the run-time value of all variables in a SIMPLE program; it then performs a post-order-style traversal of the AST and (using the environment as well as an auxiliary stack) executes the program one AST node at a time.
The interpreter also enforces the last few context conditions that could not be enforced during (static) semantic analysis because they depend on the actual run-time values computed by a program. You can get the abstract grammar for the SIMPLE programming language here. The context conditions for this assignment are given here.
The final compiler will consist of a number of modules and classes
working together to translate programs written in SIMPLE into equivalent
programs written in assembly language. While these “bits and pieces” are
spread out over the entire semester, you can already implement the basic
driver program that will orchestrate their work. The driver is called
sc and is invoked from the shell as follows:
Invocation = "./sc" ["-" ("s"|"c"|"t"|"a"|"i")] [filename] .
This describes the syntax of the command line in EBNF. After the command sc itself, the user can supply one option (introduced by “-”) to tell the driver which parts of the compiler to run and what kind of output to produce.
With this assignment the option -i is allowed on the command line for sc. The remaining options, including “no options at all,” should still result in errors, except for those (such as -a) that are unchanged from previous assignments.
For this assignment, the option -i is supposed to traverse both the symbol table and the abstract syntax tree to actually run a SIMPLE program. This includes performing input and output (on standard input and standard output) whenever the program executes a READ or WRITE instruction; no other input or output should be required or performed, except for errors.
If -i is given and an error is detected before the AST is completely built, the interpreter should not run; if an error is detected while the interpreter is running, the program should stop with an error message at that point.
If a second argument is given, it is assumed to be the file name of a SIMPLE program to process. If no filename is given, you should read the program from standard input instead. Eventually this will also determine whether the output goes to standard output or to a file, but for now all your output goes to standard output.
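As a rough illustration of the EBNF above, here is a minimal sketch of how a driver might split its arguments. It is written in Python purely for illustration; the names parse_command_line, open_source, and UsageError are made up for this sketch and are not part of the assignment.

```python
import sys

VALID_OPTIONS = {"s", "c", "t", "a", "i"}  # letters allowed by the EBNF


class UsageError(Exception):
    """Hypothetical error type for a command line that violates the EBNF."""


def parse_command_line(args):
    """Split the arguments after ./sc into (option, filename).

    Either element is None when it was not supplied.
    """
    args = list(args)
    option = None
    filename = None
    if args and args[0].startswith("-"):
        flag = args.pop(0)
        if len(flag) != 2 or flag[1] not in VALID_OPTIONS:
            raise UsageError("unknown option " + flag)
        option = flag[1]
    if args:
        filename = args.pop(0)
    if args:
        raise UsageError("too many arguments")
    return option, filename


def open_source(filename):
    """Return the stream the SIMPLE program is read from."""
    return sys.stdin if filename is None else open(filename)
```

This only checks the syntax of the command line. Deciding which of the syntactically valid options are actually accepted for a given assignment is a separate step that the sketch leaves out.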
The interpreter needs to traverse both the symbol table (ST) and the abstract syntax tree (AST), and you should apply the visitor design pattern for these tasks once more. Except for the brief notes below, you’re pretty much on your own for this assignment…
Interpreting a SIMPLE program requires that we “keep track” of the current values of all variables that were declared. This includes the elements of arrays as well as the fields of records. We need to “allocate” the necessary “storage” (which you can think of as “boxes” or something) for these variables, and we do so in a data structure called the environment.
Environments map names to storage in a way similar to the ST which maps names to meanings. However, environments do not include constants or types, both of which can occur in the ST: since their “values” cannot change during execution, we do not have to “keep track” of them.
First you need to decide how you will represent the storage (or “boxes”) necessary. Since environments are yet another example of a data structure where we have to handle various “kinds” of entries, the use of inheritance is appropriate once more.
You should write an abstract base class Box and three derived classes: IntegerBox to hold a single integer value, ArrayBox to hold the boxes that make up the value of an array, and RecordBox to hold the boxes that make up the value of a record.
For IntegerBox you will need operations to get and set the current value; you should initialize the value to zero when an IntegerBox is created. For ArrayBox and RecordBox you will need operations to access one of the boxes they are “made up of,” as well as operations that assign (in the sense of a “deep copy”) one ArrayBox or RecordBox to another (to support assignments between complete arrays or records as allowed in SIMPLE).
An element inside an ArrayBox is obviously selected using an integer index; please make sure to check if the index is actually valid! For the fields of a RecordBox, however, the way to select a particular field is not obvious at all. One idea is to use the name of the field, but that does not fit the AST we built for accessing record fields: we don’t store names but pointers to Variable nodes in the AST and Variable objects in the ST in turn. Resolving this “mismatch” is up to you…
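To make the Box hierarchy a little more concrete, here is one possible sketch in Python. Several things in it are assumptions rather than requirements: the method names element, field, copy, and assign are invented here, array indices are assumed to start at zero, and record fields are looked up by name, which is only one of several ways to resolve the “mismatch” mentioned above.

```python
from abc import ABC, abstractmethod


class Box(ABC):
    """Abstract storage for one variable (or one part of a variable)."""

    @abstractmethod
    def copy(self):
        """Return an independent deep copy of this box."""

    @abstractmethod
    def assign(self, other):
        """Overwrite this box's contents with a deep copy of other's."""


class IntegerBox(Box):
    def __init__(self, value=0):
        self._value = value  # a fresh box starts out as zero

    def get(self):
        return self._value

    def set(self, value):
        self._value = value

    def copy(self):
        return IntegerBox(self._value)

    def assign(self, other):
        self._value = other.get()


class ArrayBox(Box):
    def __init__(self, elements):
        self._elements = list(elements)

    def element(self, index):
        # Reject invalid indices explicitly; Python would otherwise
        # silently accept negative ones.
        if not 0 <= index < len(self._elements):
            raise IndexError(f"index {index} out of range")
        return self._elements[index]

    def copy(self):
        return ArrayBox(e.copy() for e in self._elements)

    def assign(self, other):
        # Both sides have the same type, so the lengths match.
        for mine, theirs in zip(self._elements, other._elements):
            mine.assign(theirs)


class RecordBox(Box):
    def __init__(self, fields):
        self._fields = dict(fields)  # assumption: keyed by field name

    def field(self, name):
        return self._fields[name]

    def copy(self):
        return RecordBox((n, b.copy()) for n, b in self._fields.items())

    def assign(self, other):
        for name, mine in self._fields.items():
            mine.assign(other._fields[name])
```

Because assign works element by element, the two boxes never share storage afterwards: changing the source later does not affect the target.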
To build the environment you have to traverse the ST, create instances of the various Box classes, and connect them as appropriate. One possibility is to add an operation to the Scope class that returns an environment for its contents. Another possibility is to implement the operation outside any class as a function, but to make it a friend of the relevant classes. You could also develop a separate visitor that traverses the ST and builds the environment. The details are up to you again…
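The following sketch shows the “function outside any class” variant in Python. Everything about the symbol table here is a hypothetical stand-in (a plain dictionary as the scope, tiny dataclasses as entries), and plain lists and dictionaries take the place of the Box classes to keep the example short; your own ST and boxes will look different.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for symbol table entries; real classes will differ.
@dataclass
class IntegerType:
    pass


@dataclass
class ArrayType:
    element_type: object
    length: int


@dataclass
class RecordType:
    fields: dict  # field name -> type


@dataclass
class Constant:
    value: int


@dataclass
class Variable:
    type: object


def make_storage(type_):
    """Allocate zero-initialised storage shaped like the given type."""
    if isinstance(type_, IntegerType):
        return [0]  # one-element list used as a mutable integer cell
    if isinstance(type_, ArrayType):
        return [make_storage(type_.element_type) for _ in range(type_.length)]
    if isinstance(type_, RecordType):
        return {name: make_storage(t) for name, t in type_.fields.items()}
    raise TypeError(f"cannot allocate storage for {type_!r}")


def build_environment(scope):
    """Map each variable name in scope to fresh storage."""
    return {
        name: make_storage(entry.type)
        for name, entry in scope.items()
        if isinstance(entry, Variable)  # constants and types need no storage
    }
```

Note how the structure of the storage simply mirrors the structure of the type, and how every array element and record field gets its own storage rather than sharing one cell.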
Once you have created the environment for a SIMPLE program, you have to actually run it. (Note that in SIMPLE the BEGIN part of a program is optional; be sure to handle this quirk!)
The process of interpreting the AST proceeds as a (mostly) post-order, (mostly) left-to-right traversal, using an auxiliary stack to hold intermediate results. Again you have a choice of using the visitor pattern for this traversal, or of implementing it in the form of recursive functions that operate on (parts of) the AST; we did the latter in the tiny example compiler I showed you at the beginning of the semester.
Remember that you have to traverse a sequence of instructions in the order they appear in the program. For each AST node you encounter during the traversal you have to perform the appropriate actions. Numerous examples of this process were given in the lecture, but here are a few reminders:
For Number nodes you should simply push the appropriate value onto the stack.
For Variable nodes you should push a pointer to the appropriate Box object onto the stack.
Field nodes should take the “box they work from” from the stack as well and push the “box they found” back onto it.
When you find a Box on the stack while you evaluate an expression, you need to get its current value (i.e. dereference it).
When you reach the end of an instruction, the stack should be empty once again (a nice “sanity check” for your code).
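The reminders above can be sketched as a small recursive evaluator in Python. The node classes Number, Variable, Binary, and Assign are hypothetical and far simpler than a real AST, the IntegerBox is a minimal stand-in, and only three operators are handled; treat this as an illustration of the stack discipline, not as a complete interpreter.

```python
from dataclasses import dataclass


class IntegerBox:
    """Minimal stand-in for the IntegerBox described earlier."""

    def __init__(self):
        self.value = 0


# Hypothetical AST node classes; a real AST carries more information.
@dataclass
class Number:
    value: int


@dataclass
class Variable:
    name: str


@dataclass
class Binary:
    operator: str
    left: object
    right: object


@dataclass
class Assign:
    location: object
    expression: object


def deref(item):
    """Turn a box found on the stack into its current value."""
    return item.value if isinstance(item, IntegerBox) else item


def evaluate(node, env, stack):
    """Post-order, left-to-right evaluation; leaves one item on the stack."""
    if isinstance(node, Number):
        stack.append(node.value)
    elif isinstance(node, Variable):
        stack.append(env[node.name])  # push the box, not its value
    elif isinstance(node, Binary):
        evaluate(node.left, env, stack)
        evaluate(node.right, env, stack)
        right = deref(stack.pop())  # the right operand was pushed last
        left = deref(stack.pop())
        results = {"+": left + right, "-": left - right, "*": left * right}
        stack.append(results[node.operator])
    else:
        raise TypeError(f"unexpected node {node!r}")


def execute(instruction, env, stack):
    """Run one Assign instruction, then check that the stack is empty."""
    evaluate(instruction.location, env, stack)
    evaluate(instruction.expression, env, stack)
    value = deref(stack.pop())
    box = stack.pop()
    box.value = value
    assert not stack, "stack should be empty after an instruction"
```

Pushing the box for a Variable (instead of its value) is what lets the same node work on both sides of an assignment: the left side needs the box itself, while the right side is dereferenced just before it is used.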
The advice from earlier assignments about using exceptions for error handling is still in effect, as is the required format for your error messages:
error: some helpful description
Enforcing context conditions for SIMPLE programs will lead to a number of “new” errors, for example when the index for an array access is out of range. If you followed the advice for error handling on previous assignments, you should have little trouble handling those new errors.
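As a reminder of how exceptions and the required message format fit together, here is a small Python sketch. The class names, the wording of the message, the use of standard error, and the exit status are all assumptions made for the example; only the “error: ” prefix comes from the format above.

```python
import sys


class SimpleError(Exception):
    """Hypothetical base class for every error the compiler reports."""


class RuntimeFailure(SimpleError):
    """A context condition that was violated while the program was running."""


def check_index(index, length):
    """Enforce the array-bounds context condition at run time."""
    if not 0 <= index < length:
        raise RuntimeFailure(
            f"index {index} out of range for array of length {length}"
        )
    return index


def run(action, err=sys.stderr):
    """Call action() and report any SimpleError; return an exit status."""
    try:
        action()
    except SimpleError as problem:
        print(f"error: {problem}", file=err)
        return 1
    return 0
```

Since the new run-time errors derive from the same base class as the compile-time ones, the single handler in run covers both without any changes.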
Input and output are almost trivial for this assignment. When you interpret a READ instruction, you should read a single integer value (followed by a newline) from standard input and store it in the appropriate Box. When you interpret a WRITE instruction, you should write a single integer value to standard output (followed by a newline).
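A minimal Python sketch of these two instructions might look as follows. The function names do_read and do_write and the stand-in IntegerBox are hypothetical, and what should happen on malformed input is an assumption (here it simply raises an exception).

```python
import sys


class IntegerBox:
    """Minimal stand-in for the IntegerBox described earlier."""

    def __init__(self):
        self.value = 0


def do_read(box, source=sys.stdin):
    """Interpret READ: store one integer from a line of input in box."""
    line = source.readline()
    try:
        box.value = int(line)
    except ValueError:
        # Assumption: malformed or missing input is reported as an error.
        raise ValueError(
            f"expected an integer but read {line.strip()!r}"
        ) from None


def do_write(value, sink=sys.stdout):
    """Interpret WRITE: print one integer followed by a newline."""
    sink.write(f"{value}\n")
```

Passing the streams as parameters is a design choice that makes the two functions easy to test without touching the real standard input and output.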
Consider the following SIMPLE program for example:
PROGRAM X;
CONST ff = 42;
BEGIN
  WRITE ff+5
END X.
If this program is stored in a file “47.sim” then running the program should work as follows:
$ ./sc -i 47.sim
47
$
Now consider the following program which simply “echoes” the number entered by the user:
PROGRAM X;
VAR x: INTEGER;
BEGIN
  READ x;
  WRITE x
END X.
Running the program should work as follows:
$ ./sc -i echo.sim
16
16
$ ./sc -i echo.sim
64738
64738
$
Here the first number was typed by the user whereas the second number is output by the interpreter. I hope these examples suffice…
If you are taking this course at the graduate level, the new run-time errors that are possible with this assignment should produce accurate position information just like your compile-time errors do. However, you do not keep running the program after a run-time error has occurred; that would just be silly.
Please follow the submission instructions as detailed on Piazza. Make sure that your tarball contains no derived files whatsoever (i.e. no executable files), but allows building all required derived files. Also make sure to include your name and email address in every file you turn in (well, in every file for which it makes sense to do so anyway)!
Regardless of your programming language of choice, we expect to build your project using make (if it needs building at all) and we expect to run your project using ./sc (which stands for “SIMPLE compiler”).
You are free to use the standard library for your language of choice,
except for modules/classes that allow you to avoid writing large
parts of the code for an assignment; so no regular expressions, no parsing
Depending on your language of choice, compliance with certain tools (e.g. valgrind), compiler flags, or additional style guides may also be required; see Piazza for details.
For reference, here is a short explanation of the grading criteria; not all of the criteria apply to all problems on a given assignment, and not all of the assignments even use all of the criteria.
Packaging refers to the proper organization of the stuff you hand in, following both the guidelines for Deliverables above as well as the general submission instructions for assignments on Piazza.
Style refers to programming style, including things like consistent indentation, appropriate identifier names, useful comments, suitable documentation, etc. Simple, clean, readable code is what you should be aiming for.
Design refers to proper modularization (into functions, classes, modules, etc.) and the proper choice of algorithms and data structures.
Performance refers to how fast/with how little memory your project can produce the required results compared to other submissions; in this course this can mean your actual compiler or interpreter as well as the code generated by it.
Functionality refers to your programs being able to do what they
should according to the specification given above.
(It also refers to you simply doing the required work, which may not be
If the specification is ambiguous, ask for clarification!
If no clarification is forthcoming, defend the choices you have made
If your project cannot be built, or if it is otherwise obvious that you
never tested it, you will get no points whatsoever.
If your project cannot be built without warnings using the required
compiler options we will take off 10%.
If your programs cannot be built using make we will take off 10%.
If valgrind detects memory errors in your programs, we will take off 10%.
If your project fails miserably even once, i.e. terminates with an
exception of any kind or dumps core, we will take off 10%.
Presumably you see the pattern here?