600.328 / 428 / 628: Compilers and Interpreters
You’re in the right place if you want to find out how compilers and
interpreters, the tools you’ve been using for quite a while now to do
all your programming, really work. You’ll also pick up some useful
software development techniques along the way.
Catalog Description: Introduction to compiler design, including
lexical analysis, parsing, syntax-directed translation, symbol tables,
run-time environments, and code generation and optimization. Students
are required to write a complete compiler as a course project.
Prerequisite(s): 600.120: Intermediate Programming, 600.226: Data
Structures. (Having taken 600.233: Computer System Fundamentals and
600.271: Automata and Computation Theory is very helpful too.)
The course includes significant programming projects; without prior
development experience you’ll probably get lost in a
maze of relatively complex code.
Please read the general course policies and take them to heart.
Additional policies specific to this course may be posted at a later date.
- Understand the theoretical foundations for compilers
- Practice object-oriented design and the application of
- Implement a full compiler for a high-level imperative language.
- Understand the hardware / software tradeoffs involved in
There is no required text.
However, it is strongly recommended that you get yourself a textbook.
The following are all excellent, but none covers exactly what we’ll do in
the course, and all cover things that we’ll never even mention.
Note that most editions will do; you don’t necessarily need the latest and
therefore most expensive one.
Compilers and Interpreters
Software Development and Design
- Assignments (about 10): 60%
- Midterm: 15%
- Final: 25%
Please check the individual assignments for due dates and the structure your
solutions should have.
See the course policies for detailed submission instructions.
If you have an opinion on these assignments, be it good or bad, please
let us know about it. We’re always trying to make these things more enjoyable
(if that’s an applicable term? :-).
This is not a schedule.
It’s a “log” of what we did, roughly, in each lecture.
Don’t expect it to turn into a schedule; it won’t.
Also there will eventually be gaps, sorry.
- January 29: Welcome,
- January 31: A tiny interpreter / compiler for integer expressions;
demo and overview;
vector and string abstractions to make C more palatable;
details on tokens, scanner, abstract syntax tree (nodes), parser;
outline of grammars (abstract grammar for AST, concrete grammar
- February 2: A tiny interpreter / compiler for integer expressions;
grammars again, abstract and concrete;
details on parser, interpreter, code generator for MIPS/SPIM.
- February 5: Compiler qualities;
Compiler architecture, phases, passes;
bootstrapping and porting (T-diagrams, self-compilation test,
- February 7: Formal languages review;
notations for grammars (EBNF in particular);
intro to lexical analysis;
regular languages and grammars;
- February 9: Scanner implementation; working from the EBNF;
token representations; interface between scanner and parser.
- February 12: Intro to syntactic analysis;
context-free languages and grammars;
top-down versus bottom-up parsing;
left recursion elimination for top-down parsing;
recursive descent parsing;
- February 14: Parser implementation;
working from the EBNF;
intro to error handling;
panic mode: stop at the first error;
synchronizing parser and input for recovery;
“It’s full of heuristics!”;
first and follow sets;
why error handling based on follow sets is complicated;
using weak and strong tokens/symbols;
missing weak symbols don’t require synchronization
(neither do some common “wrong use” errors like “=” versus “:=”);
after a non-weak error, resynchronize at a strong symbol;
filtering excessive error messages.
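To make the log entries above a little more concrete, here is a minimal sketch of a recursive descent interpreter for integer expressions, working from a small EBNF. It is not the code shown in lecture: the grammar, the names (`Parser`, `next_token`, `report`, `eval_string`) and the choice to treat a missing “)” as a weak symbol are all assumptions made for illustration. It interprets only (no MIPS code generation) and does not check for integer overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Assumed EBNF (illustrative, not the course grammar):
 *   expr   = term   { ("+" | "-") term } .
 *   term   = factor { ("*" | "/") factor } .
 *   factor = number | "(" expr ")" .
 */
typedef enum { TOK_NUM, TOK_PLUS, TOK_MINUS, TOK_STAR, TOK_SLASH,
               TOK_LPAREN, TOK_RPAREN, TOK_EOF, TOK_BAD } TokenKind;

typedef struct {
    const char *src;  /* remaining input */
    TokenKind kind;   /* current token */
    long value;       /* numeric value when kind == TOK_NUM */
    int errors;       /* number of errors reported so far */
} Parser;

/* Scanner: skip whitespace, then classify the next token. */
static void next_token(Parser *p) {
    while (isspace((unsigned char)*p->src)) p->src++;
    char c = *p->src;
    if (c == '\0') { p->kind = TOK_EOF; return; }
    if (isdigit((unsigned char)c)) {
        p->value = 0;
        while (isdigit((unsigned char)*p->src))
            p->value = p->value * 10 + (*p->src++ - '0');
        p->kind = TOK_NUM;
        return;
    }
    p->src++;
    switch (c) {
    case '+': p->kind = TOK_PLUS;   break;
    case '-': p->kind = TOK_MINUS;  break;
    case '*': p->kind = TOK_STAR;   break;
    case '/': p->kind = TOK_SLASH;  break;
    case '(': p->kind = TOK_LPAREN; break;
    case ')': p->kind = TOK_RPAREN; break;
    default:  p->kind = TOK_BAD;    break;
    }
}

static void report(Parser *p, const char *msg) {
    fprintf(stderr, "error: %s\n", msg);
    p->errors++;
}

static long parse_expr(Parser *p);

/* factor = number | "(" expr ")" . */
static long parse_factor(Parser *p) {
    if (p->kind == TOK_NUM) { long v = p->value; next_token(p); return v; }
    if (p->kind == TOK_LPAREN) {
        next_token(p);
        long v = parse_expr(p);
        if (p->kind == TOK_RPAREN) next_token(p);
        else report(p, "missing ')'");  /* weak symbol: report, no resync */
        return v;
    }
    report(p, "expected number or '('");
    if (p->kind != TOK_EOF) next_token(p);  /* skip the offending token */
    return 0;
}

/* term = factor { ("*" | "/") factor } . */
static long parse_term(Parser *p) {
    long v = parse_factor(p);
    while (p->kind == TOK_STAR || p->kind == TOK_SLASH) {
        TokenKind op = p->kind;
        next_token(p);
        long rhs = parse_factor(p);
        if (op == TOK_STAR) v *= rhs;
        else if (rhs != 0) v /= rhs;
        else { report(p, "division by zero"); v = 0; }
    }
    return v;
}

/* expr = term { ("+" | "-") term } . */
static long parse_expr(Parser *p) {
    long v = parse_term(p);
    while (p->kind == TOK_PLUS || p->kind == TOK_MINUS) {
        TokenKind op = p->kind;
        next_token(p);
        long rhs = parse_term(p);
        v = (op == TOK_PLUS) ? v + rhs : v - rhs;
    }
    return v;
}

/* Evaluate src; the number of reported errors is stored in *errors. */
long eval_string(const char *src, int *errors) {
    assert(src != NULL && errors != NULL);
    Parser p = { src, TOK_EOF, 0, 0 };
    next_token(&p);
    long v = parse_expr(&p);
    if (p.kind != TOK_EOF) report(&p, "unexpected input after expression");
    *errors = p.errors;
    return v;
}
```

Each grammar rule maps to one function, and the `{ ... }` repetitions become `while` loops, which is what makes working from the EBNF so mechanical. Calling `eval_string("(1 + 2 * 3", &errs)` reports the missing “)” once and still evaluates the expression, since a missing weak symbol does not force the parser to resynchronize. A real compiler would build an abstract syntax tree instead of computing values directly, and would resynchronize at strong symbols after more serious errors.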