Annotated Abstract Grammar for SIMPLE

The abstract grammar for SIMPLE is given in a variant of Extended Backus-Naur Form (EBNF). Note that this grammar is not suitable for parsing SIMPLE programs; instead, it describes the possible abstract syntax trees (ASTs) we can generate.

Parts in bold indicate where nodes of the AST point to usable symbol table (ST) objects, namely constants and variables. Note that not every ST object comes from an explicit declaration; the object for a literal number, for example, does not.

Nodes that require type information to enforce context conditions should also point to the relevant types in the symbol table. Again, not every ST type comes from an explicit declaration; the types that arise within selector cascades, for example, may have none.

Instructions = Instruction | Instruction Instructions .

Models a list of at least one instruction; the order of instructions in the input program matters and must be preserved.
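One way to picture this production is as a singly linked chain of instruction nodes, sketched below in Python. The class name InstructionNode, its next field, and the helpers chain and to_list are illustrative assumptions, not names prescribed by the grammar; the label string merely stands in for a real instruction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InstructionNode:
    """One instruction plus a link to the rest of the sequence (hypothetical layout)."""
    label: str                                # stand-in for the real instruction payload
    next: Optional["InstructionNode"] = None  # None marks the last instruction


def chain(labels: List[str]) -> InstructionNode:
    """Build a chain from a non-empty list, preserving program order."""
    if not labels:
        raise ValueError("Instructions must contain at least one instruction")
    head = None
    # Link from the back so the first label ends up at the head.
    for label in reversed(labels):
        head = InstructionNode(label, head)
    return head


def to_list(node: Optional[InstructionNode]) -> List[str]:
    """Walk the chain front to back."""
    out = []
    while node is not None:
        out.append(node.label)
        node = node.next
    return out
```

A plain Python list of instruction nodes would serve equally well; the linked form simply mirrors the recursive shape of the production.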

Instruction = Assign | If | Repeat | Read | Write .

There are five kinds of instructions: assignment, if, repeat, read, and write. There are no while instructions! Each while in the input program is transformed into a repeat nested inside an if instead.
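The while transformation can be sketched as follows. This is a minimal sketch that assumes a Repeat node stores the exit condition (the body runs at least once and the loop ends as soon as the condition holds), so the while condition has to be negated before it is attached to the Repeat. If your Repeat node stores the continue condition instead, the negation is unnecessary. All class and function names here are hypothetical, and operands are plain strings for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional

# Each relation mapped to its logical negation.
NEGATED = {"=": "#", "#": "=", "<": ">=", ">=": "<", ">": "<=", "<=": ">"}


@dataclass
class Condition:
    relation: str
    left: object
    right: object


@dataclass
class Repeat:
    condition: Condition  # assumed: the loop exits once this holds
    body: List[object]


@dataclass
class If:
    condition: Condition
    true_part: List[object]
    false_part: Optional[List[object]] = None  # absent else part


def negate(c: Condition) -> Condition:
    """Flip the relation; the operands stay the same."""
    return Condition(NEGATED[c.relation], c.left, c.right)


def desugar_while(condition: Condition, body: List[object]) -> If:
    """WHILE c DO body  ==>  IF c THEN REPEAT body UNTIL not-c."""
    return If(condition, [Repeat(negate(condition), body)])
```

The outer If guards the first iteration, which a Repeat on its own would execute unconditionally.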

Assign = Location Expression .
Read = Location .

The destination of an assignment or read must be a writable location in memory, which can be one of three things:

Location = Variable | Index | Field .
Index = Location Expression .
Field = Location Variable .

A location is either a variable, an element of an array variable indexed by an expression (think of an array as “lots of anonymous variables”), or a field within a record variable (the field is itself a variable). The production for fields might be confusing at first. The point is that we need to keep track of both the record variable that was selected from and the field variable that was selected within the record; but whereas the record part can be produced by preceding selectors, the field part cannot (see concrete grammar).
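To make the three productions concrete, here is a small Python sketch of location nodes. The class names follow the grammar, but the field names, the show helper, and the a[i].f notation used for printing are assumptions made for illustration; a real Variable node would point to an ST object rather than hold a name.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Number:
    value: int


@dataclass
class Variable:
    name: str  # stand-in for a pointer to the ST variable object


@dataclass
class Index:
    location: "Location"      # the array being indexed
    expression: "Expression"  # which element is selected


@dataclass
class Field:
    location: "Location"  # the record being selected from
    variable: Variable    # the field selected within that record


Location = Union[Variable, Index, Field]
Expression = Union[Number, Location]  # Binary omitted for brevity


def show(node) -> str:
    """Render a node in selector notation (notation assumed for illustration)."""
    if isinstance(node, Number):
        return str(node.value)
    if isinstance(node, Variable):
        return node.name
    if isinstance(node, Index):
        return f"{show(node.location)}[{show(node.expression)}]"
    if isinstance(node, Field):
        return f"{show(node.location)}.{show(node.variable)}"
    raise TypeError(f"not a location or number: {node!r}")


# Field f of element i of array a: the record part is itself an Index node,
# while the field part is always a plain Variable.
tree = Field(Index(Variable("a"), Variable("i")), Variable("f"))
```

Here the left child of Field is an arbitrary location produced by the preceding selectors, whereas the right child is always a plain Variable, which is the asymmetry the Field production expresses.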

If = Condition Instructions_true [Instructions_false] .
Repeat = Condition Instructions .
Write = Expression .

The else part of an if is optional, so the instructions for it can be absent.

Expression = Number | Location | Binary .
Binary = Operator Expression_left Expression_right .

Inside expressions, locations are not used as destinations but as sources of values. The operator in a binary expression can be +, -, *, DIV, or MOD.
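As an illustration, the expression productions might be modelled as below. This is a sketch under assumptions: the field names, the operator check in the constructor, and the prefix printer are hypothetical, and Variable stands in for the full Location hierarchy.

```python
from dataclasses import dataclass
from typing import Union

# The five operators the grammar allows in a binary expression.
OPERATORS = {"+", "-", "*", "DIV", "MOD"}


@dataclass
class Number:
    value: int


@dataclass
class Variable:  # simplest kind of location, used here as a source of a value
    name: str


@dataclass
class Binary:
    operator: str
    left: "Expression"
    right: "Expression"

    def __post_init__(self):
        # Reject operators outside the grammar as early as possible.
        if self.operator not in OPERATORS:
            raise ValueError(f"unknown operator: {self.operator}")


Expression = Union[Number, Variable, Binary]


def prefix(e) -> str:
    """Print an expression tree in prefix form, mirroring the production order."""
    if isinstance(e, Number):
        return str(e.value)
    if isinstance(e, Variable):
        return e.name
    return f"({e.operator} {prefix(e.left)} {prefix(e.right)})"


# x + 2 * 3, with the multiplication nested as the right operand
tree = Binary("+", Variable("x"), Binary("*", Number(2), Number(3)))
```

Because the tree shape already encodes precedence and grouping, the abstract grammar needs no productions for parentheses or for separate term and factor levels.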

Condition = Relation Expression_left Expression_right .

The relation in a condition can be =, #, <, >, <=, or >=.

Notes on AST Construction