Operational Semantics

Other forms of language semantics: these are also interesting viewpoints, but we don't have the time to cover them.

Operational Semantics

Goal: define how programs compute/evaluate/execute.
Here is an example of a computation (imagine the tree):
(node A): (Function x -> x + 2) (3 + 2 + 5) ==> 12  because
(node B, child of A):   3 + 2 + 5 ==> 10, because
(node C, child of B):      3 + 2 ==> 5, and
(node D, child of B):      5 + 5 ==> 10; and then,
(node E, child of A):   10 + 2 ==> 12.
... In general, to compute a function application,
  1. Compute the argument to a value
  2. Compute the body of the function with the argument textually substituted, to a value.
So, to compute this application there are two sub-computations performed, which are subtrees.

Definition. An operational semantics for a programming language is a mathematical definition of its computation relation, e ==> v, where e is a program in the language and v is the value it computes.

Behind every language you have ever programmed is an operational semantics, but it is usually described informally, in English.
Operational semantics may be given to just about any kind of language behavior, but the rules do get more complicated.

Operational Semantics for Logic Expressions

Let's warm up with something very simple: boolean logic without variables.

Definition. The boolean logic expressions e consist of values True and False, and expressions e And e, e Or e, Not e, and e Implies e.

type boolexp = True | False | Not of boolexp | And of boolexp *
boolexp | Or of boolexp * boolexp | Implies of boolexp * boolexp
(Note: We are going to use Capitalized keywords in all of our little language syntax to avoid potential conflicts with e.g. Caml code.)

Definition. The operational semantics for boolean logic is defined as the least relation ==> satisfying the following rules.
True rule
True ==> True

False rule
False ==> False

Not rule
e ==> v
Not e ==> the negation of v

And rule
e1 ==> v1, e2 ==> v2
e1 And e2 ==> the logical and of v1 and v2

Or, Implies rules: should be clear from above.

Comments on the rules
An example.

Not(Not(False)) And True ==> False, because by the And rule,

True ==> True, and Not(Not(False)) ==> False, the latter because

Not(False) ==> True, because

False ==> False.

This computation is a tree because there are two subcomputations necessary for each binary operator.

Question: Why in the above definition does it state that ==> is the "least" relation satisfying the rules?
Answer: "least" here means fewest pairs related. If we did not state this requirement, then a relation which related anything to anything else would also be a relation satisfying all the rules (think about it).

Provable Properties of Operational Semantics

The great thing about operational semantics is we can actually prove some properties about execution.

Lemma. The boolean language is deterministic: if e ==> v and e ==> v', then v = v'.

Proof. By induction on the height of the proof tree.

Lemma. The boolean language is normalizing: For all boolean expressions e, there is some value v where e ==> v.

Proof. By induction on the size of e.

Question: Suppose we left off the True rule by mistake; what nice property would fail?

Operational Semantics and Interpreters

There is a very close relationship between an operational semantics and an actual interpreter written in Caml.

Given an operational semantics defined via relation ==>, there is a corresponding (Caml) evaluator function eval.

Note, the Caml function eval takes the program e as argument in the form of its syntax tree, for instance Plus(Int(1),Times(Int(2),Int(3))).

Definition. A (Caml) interpreter function eval faithfully implements an operational semantics e ==> v if

e ==> v if and only if eval(e) returns result v.

Implementing an operational semantics

The above rules induce a Caml interpreter function eval as follows.
let rec eval exp = 
  match exp with 
    True -> True
  | False -> False
  | Not(exp0) -> (match eval exp0 with
      True -> False
    | False -> True)
  | And(exp0,exp1) -> (match (eval exp0, eval exp1) with
      (True,True) -> True
    | (_,False) -> False
    | (False,_) -> False)

  | Or(exp0,exp1) -> (match (eval exp0, eval exp1) with
      (False,False) -> False
    | (_,True) -> True
    | (True,_) -> True)

  | Implies(exp0,exp1) -> (match (eval exp0, eval exp1) with
      (False,_) -> True
    | (True,True) -> True
    | (True,False) -> False)
The only difference between the operational semantics and the evaluator is that the evaluator is a function: we start with the bottom-left expression in a rule, use the evaluator to recursively produce the value(s) above the line in the rule, and finally compute and return the value below the line in the rule.
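To see the correspondence concretely, here is a self-contained copy of the type and evaluator, replaying the earlier derivation Not(Not(False)) And True ==> False. The wildcard catch-all cases are a compactness choice of ours, not part of the version above; they behave identically on values.

```ocaml
type boolexp = True | False | Not of boolexp | And of boolexp * boolexp
             | Or of boolexp * boolexp | Implies of boolexp * boolexp

let rec eval exp =
  match exp with
    True -> True
  | False -> False
  | Not exp0 ->
      (match eval exp0 with
         True -> False
       | _ -> True)
  | And (exp0, exp1) ->
      (match (eval exp0, eval exp1) with
         (True, True) -> True
       | _ -> False)
  | Or (exp0, exp1) ->
      (match (eval exp0, eval exp1) with
         (False, False) -> False
       | _ -> True)
  | Implies (exp0, exp1) ->
      (match (eval exp0, eval exp1) with
         (True, False) -> False
       | _ -> True)

(* replay the derivation: Not(Not(False)) And True ==> False *)
let result = eval (And (Not (Not False), True))
```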
Fact. The boolean language interpreter above faithfully implements its operational semantics: e ==> v if and only if eval(e) returns v as result.

We will go back and forth between these two forms during the course. The operational semantics form is used because it is independent of any particular programming language. The evaluator form is good because you can test your evaluator on real code.

Question: Why not just use interpreters and forget about the operational semantics approach?
Answer: Then the whole exercise is circular, since we don't really know what the Caml compiler is doing. Operational semantics provides a foundation free of any particular language.

Definition. A metacircular interpreter is an interpreter for (possibly a subset of) language X that is written in language X. Metacircular interpreters give you some idea of how a language works, but suffer from the above non-foundational problems. A metacircular interpreter for Lisp is a classic programming language theory exercise.

The D Language

We now study our first programming language, D.

D is a "Diminutive" pure functional programming language.

D and the lambda calculus

The lambda calculus is an even simpler language with only functions, which are written lambda x.e instead of Function x -> e, where lambda is the lowercase Greek letter. It is the original higher-order functional language, and dates from the 1930's (!). More later on this.

Turing Completeness of D

D is still Turing complete: every partial recursive function on numbers can be written in D. In fact, it's even Turing-complete without the numbers or booleans (the pure lambda-calculus). No (deterministic) programming language can compute more than the partial recursive functions.

Definition. The expressions, e, of the D language are inductively defined as the least set including

  1. variables x,
  2. (anonymous) functions Function x -> e and function application e e,
  3. recursive functions Let Rec f x = e,
  4. numbers 0, 1, -1, 2, -2, ... and numerical operations +, -, and =,
  5. booleans True, False and boolean operations And, Or, Not,
  6. and conditionals If e Then e Else e.
The value expressions of D are the (anonymous and recursive) functions, the numbers, and the booleans True and False. Note, the metavariables we are using include e meaning an arbitrary D expression, v meaning an arbitrary expression that is a value, and x meaning an expression which is a variable.

The Caml variant type for D syntax is as follows.
type ident = Ident of string

type expr = 
 Var of ident | Function of ident * expr | Appl of expr * expr |
 Letrec of ident * ident * expr |
 Plus of expr * expr | Minus of expr * expr | Equal of expr * expr | 
 And of expr * expr| Or of expr * expr | Not of expr |  
 If of expr * expr * expr | Int of int | Bool of bool 
Abstract and concrete syntax We will glibly switch back and forth between the concrete and abstract syntax: if we are talking relative to Caml the abstract syntax will be used, and outside of Caml we will use the concrete syntax.
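For instance, the running example (Function x -> x + 2) (3 + 2 + 5) corresponds to the following abstract-syntax value (the type declaration is repeated here so the snippet stands alone):

```ocaml
type ident = Ident of string

type expr =
 Var of ident | Function of ident * expr | Appl of expr * expr |
 Letrec of ident * ident * expr |
 Plus of expr * expr | Minus of expr * expr | Equal of expr * expr |
 And of expr * expr | Or of expr * expr | Not of expr |
 If of expr * expr * expr | Int of int | Bool of bool

(* the concrete program (Function x -> x + 2) (3 + 2 + 5) *)
let example =
  Appl (Function (Ident "x", Plus (Var (Ident "x"), Int 2)),
        Plus (Plus (Int 3, Int 2), Int 5))
```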

Higher-order functions in D The main feature of D is higher-order functions, which also introduces variables. Recall that programs are computed by rewriting them.
(Function x -> x + 2) (3 + 2 + 5) ==> 12  because
  3 + 2 + 5 ==> 10, because
    3 + 2 ==> 5, and
    5 + 5 ==> 10; and then,
  10 + 2 ==> 12.
Note how in this example, the argument is substituted for the variable in the body -- this gives us a rewriting interpreter.

Variable Substitution

To do this right, we need to define the concepts of an occurrence of a variable, a bound occurrence, a free occurrence, a binding, a closed expression, and substitution.


  1. A variable use x occurs in e if x appears somewhere in e. Note we refer to variable uses only, not definitions.
  2. Any occurrences of variable x in Function x -> e are bound; any free occurrences of x in e here are bound occurrences in Function x -> e. Similarly, occurrences of f and x are bound in Let Rec f x = e.
  3. A variable x occurs free in e if it has an occurrence in e which is not a bound occurrence.
  4. An expression e is closed if it contains no free variable occurrences. All programs we execute are closed (no link-time errors).
  5. e[e'/x] is notation for the expression resulting from the operation of replacing all free occurrences of x in e with e'. For now, we assume that e' is a closed expression.
The notions of bound and free should be familiar to you from block-structured languages.
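Definition 5 can be turned into code directly. Here is a sketch of e[e'/x] in Caml over a fragment of the syntax, assuming e' is closed so no capture can occur; the function name subst is our own, not part of the homework code.

```ocaml
type ident = Ident of string
type expr = Var of ident | Function of ident * expr | Appl of expr * expr
          | Plus of expr * expr | Int of int

(* subst e x e' computes e[e'/x], assuming e' is closed *)
let rec subst e x e' = match e with
  | Var y -> if y = x then e' else e
  | Function (y, body) ->
      if y = x then e                      (* x is rebound here: stop *)
      else Function (y, subst body x e')
  | Appl (e1, e2) -> Appl (subst e1 x e', subst e2 x e')
  | Plus (e1, e2) -> Plus (subst e1 x e', subst e2 x e')
  | Int _ -> e
```

For example, (x + 2)[10/x] yields 10 + 2, while (Function x -> x)[5/x] is unchanged since the occurrence of x is bound.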


Operational Semantics for D

We are now ready to define the operational semantics for D.

Definition. The computation relation ==> for language D is the least relation on closed expressions in D satisfying the following rules.

Value rule
v ==> v

Boolean rules: see above boolean language

+ rule
e1 ==> v1, e2 ==> v2
e1 + e2 ==> the integer sum of v1 and v2, provided v1 and v2 are integer constants

- rule
Similar to +.

= rule
e1 ==> v1, e2 ==> v2
e1 = e2 ==> True if v1 and v2 are identical numbers, and ==> False if they are different numbers or not numbers.

If True rule
e1 ==> True, e2 ==> v2
If e1 Then e2 Else e3 ==> v2

If False rule
e1 ==> False, e3 ==> v3
If e1 Then e2 Else e3 ==> v3

(Anonymous) Function application rule
e1 ==> Function x -> e, e2 ==> v2, e[v2/x] ==> v
e1 e2 ==> v

Let Rec Function application rule
e1 ==> Let Rec f x = e, e2 ==> v2, e[v2/x][(Let Rec f x = e)/f] ==> v
e1 e2 ==> v

Example Executions

If in doubt, draw out the derivation trees that show the execution precisely.
If 3 = 4 Then 5 Else 4 + 2 ==> 6 because
  3 = 4 ==> False and
  4 + 2 ==> 6, because
    4 ==> 4 and
    2 ==> 2 and 4 plus 2 is 6.

(Function x -> If 3 = x Then 5 Else x + 2) 4 ==> 6, because of
above derivation

(Function x -> x x)(Function y -> y) ==> Function y -> y, because
  (Function y -> y)(Function y -> y) ==> Function y -> y

(Function f -> Function x -> f(f(x)))(Function x -> x - 1)(4) ==> 2 because
letting F abbreviate (Function x -> x - 1),
  (Function x -> F(F(x)))(4) ==> 2, because
    F(F(4)) ==> 2, because
        F(4) ==> 3, because
          4 - 1 ==> 3.  And then,
        F(3) ==> 2, because
          3 - 1 ==> 2.

(Function x -> Function y -> x+y)
  ((Function x -> If 3 = x Then 5 Else x + 2) 4)
  ((Function f -> Function x -> f(f(x))) (Function x -> x - 1) (4)) ==> 8 by the above two executions

(Let Rec f x = If x = 0 Then 0 Else x + f (x - 1))(1) ==> 1 because
  letting F abbreviate (Let Rec f x = If x = 0 Then 0 Else x + f (x - 1)),
  If 1 = 0 Then 0 Else 1 + F (1 - 1) ==> 1, because
   1 = 0 ==> False, and
   1 + F (1 - 1) ==> 1, because
      F (1 - 1) ==> 0, because
      1 - 1 ==> 0, and
      If 0 = 0 Then 0 Else 0 + F (0 - 1) ==> 0, because
        0 = 0 ==> True, and
        0 ==> 0

Mathematical Properties of D programs

Lemma. D is deterministic.
Proof. By inspection of the rules, at most one rule can apply at any time. (Need the Let Rec rule to prove this precisely)

Lemma. D is not normalizing: there is some e such that there is no v with e ==> v.
Proof. (Function x -> x x)(Function x -> x x) is not normalizing. Neither is 4 3.

A D Interpreter

As part of homework 3, you are to write a Caml eval function which takes D programs and produces D values as result, following the above operational semantics. File D-examples.ml contains the Caml type for D syntax as well as some sample executions. That file will be reviewed in class, and contains concrete D code for most of the examples in the remainder of these notes.

Pure functional programming in D

D doesn't have many features, but it is possible to do much more than you may at first think.

Here are the combinators as given in the D-examples.ml file.
(* First some abbreviations to save finger wear *)

let i s = Ident s            (* abbreviation for identifiers *)
let v s = Var(Ident s)       (* abbreviation for variables *)

(* super shorthand for common ident/var names *)

let ix = i"x" (* ident x *)
let vx = v"x" (* variable x *)
let iy = i"y" (* ident y *)
let vy = v"y" (* variable y *)
let iz = i"z" (* ident z *)
let vz = v"z" (* variable z *)
let il = i"l" (* ident l *)
let vl = v"l" (* variable l *)
let ir = i"r" (* ident r *)
let vr = v"r" (* variable r *)

(* The classic pure functional combinators *)

let id = Function(ix,vx)                        (* I x = x *)
let k = Function(ix,Function(iy,vx))            (* K x y = x *)
let s = Function(ix,Function(iy,Function(iz,    (* S x y z = (x z) (y z) *)
          Appl(Appl(vx,vz),Appl(vy,vz)))))
let d = Function(ix, Appl(vx,vx))               (* D x = x x *)
Macros: It turns out that D can express a great deal more than it first appears. In particular, the expressive power of Caml's let, recursive definitions via let rec, and structures such as lists or other datatypes are all encodable in D, by applying a trick or two.

Encoding n-tuples and lists

We will define a 2-tuple (pairing) constructor; from a pair you can get a 3-tuple by building it from pairs, as (1, (2,3)), and so on for n-tuples.
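The same nesting trick can be seen in Caml's own pairs; here is a 3-tuple and its projections, built only from 2-tuples:

```ocaml
(* a 3-tuple (1, 2, 3) encoded as the nested pair (1, (2, 3)) *)
let triple = (1, (2, 3))

(* projections, each defined by pattern matching on the nesting *)
let first  (a, _)      = a
let second (_, (b, _)) = b
let third  (_, (_, c)) = c
```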

(* Pairs may be encoded as functions (not entirely adequate however) *)

let pr (l,r) =  (* make a pair with left element l and right element r *)
  Appl(Appl(Function(il,Function(ir,Function(ix,Appl(Appl(vx,vl),vr)))),l),r)

let prexample = pr(Int 4,pr(Int 5,Bool true))

(* projections left and right *)

let left e =  Appl(e,Function(ix,Function(iy,vx)))
let right e = Appl(e,Function(ix,Function(iy,vy)))
Test: try pr(4,5). That is
(Function l -> Function r -> Function x -> x l r) 4 5
which computes to
(Function x -> x 4 5)
which is a value and we are done. Now let's try left(pr(4,5)). We have pr(4,5)'s value from above; continuing, left(pr(4,5)) is
(Function x -> x 4 5)(Function x -> Function y -> x)
which computes by computing
(Function x -> Function y -> x) 4 5
which computes by computing
(Function y -> 4) 5
which computes to 4.
Problems with this encoding of pairs:
Lists can also be implemented via pairs. Here are the implementations.
(* Lists may be encoded as a pair consisting of the head and tail *)

let head = left
let tail = right
let emptylist = (Int 0)  (* something for empty list *)
let cons = pr
let length = Letrec(i"Length",ix,
    If(Equal(vx,emptylist),(Int 0),Plus(Appl(v"Length",tail vx),
        (Int 1))))

let aList = cons(Plus(Int 1,Int 1),cons(Plus(Int 1,Int 1),cons(Int 3,emptylist)))

Other Examples of Expressiveness Within D

Functions of multiple arguments: use currying, just as is common in Caml.

Let is definable:

Let x = e In e' is defined as (Function x -> e') e 
An example: Let x = 3 + 2 In x + x is (Function x -> x + x)(3 + 2), which evaluates to 10.
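The desugaring is one line over the abstract syntax; makelet is our own helper name, not from D-examples.ml:

```ocaml
type ident = Ident of string
type expr = Var of ident | Function of ident * expr | Appl of expr * expr
          | Plus of expr * expr | Int of int

(* Let x = e1 In e2 desugars to (Function x -> e2) e1 *)
let makelet x e1 e2 = Appl (Function (x, e2), e1)

(* Let x = 3 + 2 In x + x *)
let example =
  makelet (Ident "x") (Plus (Int 3, Int 2))
          (Plus (Var (Ident "x"), Var (Ident "x")))
```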

Sequencing. Notice there is no sequencing (;) operation. Why not? Answer: if e;e' is what you want to sequence, you might as well just write e', as e will never get used. This changes if Print or mutable state is added (operators with side effects). Sequencing is definable, nonetheless:

e ; e' is defined as (Function newvar -> e') e, where newvar is chosen so as not to be
free in  e' 
This will first execute e, throw away the value, and then execute e', returning its result as the final result of e;e'.
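Sequencing desugars the same way; seq is our own helper, and we simply hard-code a variable name that we assume is not free in e':

```ocaml
type ident = Ident of string
type expr = Var of ident | Function of ident * expr | Appl of expr * expr
          | Int of int

(* e ; e' desugars to (Function newvar -> e') e; we assume the
   hard-coded name "_seq" is not free in e' *)
let seq e e' = Appl (Function (Ident "_seq", e'), e)
```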

Freezing and thawing We can stop and re-start computation at will by freezing and thawing: Freeze e is defined as Function x -> e for some x not occurring in e, which delays the computation of e inside a function body; Thaw e is defined as e 0, applying the frozen expression to the dummy argument 0 to resume it.

Encoding Recursion in D

D has a built-in Letrec to write recursive functions, but its actually not needed to write recursive functions! Some special trickery is needed. Wax those surfboards, a wave is coming.

Q: How can programs compute forever in D without Let Rec?

A: Easy: (Function x -> x x)(Function x -> x x). Corollary: D is not normalizing.

Russell's paradox

In Frege's set theory (circa 1900), sets were written as predicates P(x), which we can view as functions.

Now consider P = "the set of all sets that do not contain themselves as members":

P = Function x -> Not(x x)
(Note, it may make sense to have a set with itself as member: the set {{{{...}}}}, infinitely receding, has itself as a member; this only happens in so-called non-well-founded set theory).

Now, is P P? Namely is P a member of itself? This is written:

(Function x -> Not(x x)) (Function x -> Not(x x))
--if this were viewed as a D program, it would loop forever: it suffices to compute
Not((Function x -> Not(x x))(Function x -> Not(x x)))
Now, notice we have P is a member of itself if and only if it isn't, a contradiction!

Encoding recursion by passing self

Here is the idea: to encode recursion, a function takes an extra argument, itself, and makes recursive calls by passing itself along.
Here is how a summation function can be defined around these ideas, which summates the numbers 0..n for argument n. First define
summate0 = Function this -> Function arg ->
  If arg = 0 Then 0 Else arg + this(this)(arg-1)
Then we can write a function call as
  summate0(summate0)(7) (* summates numbers 0 .. 7 *)
In general, we can write the whole thing in D as
summate =
  Let summ = (Function this -> Function arg ->
    If arg = 0 Then 0 Else arg + this(this)(arg-1))
  In Function arg -> summ(summ)(arg)
and invoke as
summate 7 (* summates numbers 0..7 *)
so we don't have to let the world see the self-passing business.
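The self-passing idea can be replayed in Caml itself. Caml's type system rejects the bare self-application this(this), so we wrap the function in a variant type; the names Self and the Caml-level summate0 here are our own.

```ocaml
(* self-passing recursion without let rec; the variant wrapper Self
   stands in for the untyped self-application this(this) *)
type self = Self of (self -> int -> int)

let summate0 = Self (fun (Self this as s) arg ->
  if arg = 0 then 0 else arg + this s (arg - 1))

(* hide the self-passing behind an ordinary function *)
let summate n = match summate0 with Self f -> f summate0 n

(* summate 7 computes 0 + 1 + ... + 7 = 28 *)
```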

The Y-Combinator. The Y-combinator is a further abstraction on this: summ can be abstracted to be some abstract body passed in itself as a higher-order function.
almosty = Function body -> 
      Let fun = (Function this -> Function arg ->
        body(this)(arg))
      In Function arg -> (fun fun)(arg)
-- the body of summ above contains arg and this, so the abstract body body gets those things passed to it. almosty can be used by defining summate as
summate = almosty (Function this -> Function arg ->
    If arg = 0 Then 0 Else arg + this(this)(arg-1))
The Y-combinator actually goes one more step and passes this(this) as argument, not just this, simplifying what we pass to Y:
y = Function body -> 
      Let fun = (Function this -> Function arg ->
        body(this this)(arg))
      In Function arg -> (fun fun)(arg)
This combinator can then be used to define summate as
summate = y (Function thisthis -> Function arg ->
    If arg = 0 Then 0 Else arg + thisthis(arg-1) + 1)
-- the parameter thisthis is exactly used for a recursive call.

The above is almost the Y combinator given in the D-examples.ml file; the major difference is that version has fun inlined (repeated twice) instead of being defined via Let.
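The Y-combinator can likewise be written in Caml with the same variant-type wrapper used above; as in the D version, the body receives this(this) already applied, so it sees an ordinary recursive-call function. The names Wrap and y here are our own.

```ocaml
(* a monomorphic Y-combinator for int -> int functions; the variant
   Wrap types the self-application fun fun *)
type wrap = Wrap of (wrap -> int -> int)

let y body =
  let fn = Wrap (fun (Wrap this as s) arg -> body (this s) arg) in
  fun arg -> (match fn with Wrap f -> f fn arg)

let summate = y (fun thisthis arg ->
  if arg = 0 then 0 else arg + thisthis (arg - 1))

(* summate 7 computes 0 + 1 + ... + 7 = 28 *)
```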

Call-by-name Parameter Passing

Definition Define a call-by-name evaluation relation ==> for D by replacing the Function application rule with the following rule.
call-by-name Function application rule
e1 ==> Function x -> e, e[e2/x] ==> v
e1 e2 ==> v

And, similarly a new rule for Let Rec is needed.

Freezing and Thawing, defined above, is a way to get call-by-name behavior in a call-by-value language.
Consider then the computation of
(Function x -> Thaw(x) + Thaw(x)) (Freeze(3-2))
-- 3-2 is not evaluated until we are inside the body of the function where it is thawed, and it is then evaluated two separate times. This is precisely the behavior of call-by-name parameter passing, so Freeze and Thaw can encode it by this means. The fact that 3-2 is executed twice shows the main weakness of call by name: repeated evaluation of the function argument.

Lazy or call-by-need evaluation is a version of call-by-name that caches evaluated function arguments the first time they are evaluated so it doesn't have to re-evaluate them in subsequent uses. Haskell is a pure functional language with lazy evaluation.
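Both behaviors can be simulated in Caml with thunks and the built-in Lazy module; the reference cells below are just instrumentation of ours to count how many times the argument 3 - 2 is evaluated.

```ocaml
(* call-by-name: the argument is a thunk, re-evaluated at each use *)
let count_name = ref 0
let arg_name () = incr count_name; 3 - 2
let f x = x () + x ()
let result_name = f arg_name          (* evaluates 3 - 2 twice *)

(* call-by-need: lazy caches the first evaluation *)
let count_need = ref 0
let arg_need = lazy (incr count_need; 3 - 2)
let g x = Lazy.force x + Lazy.force x
let result_need = g arg_need          (* evaluates 3 - 2 once *)
```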

The (pure) lambda-calculus

A classic simple language with only functions: take D and remove the numbers, booleans, and conditional.

It is called the lambda-calculus because functions are written lambda x.e (using the Greek lambda character) instead of Function x -> e.

Fact: Numbers, booleans, and conditional can be encoded in the pure lambda-calculus.

Execution in the pure lambda calculus

This form of computation is interesting conceptually but is more distant from how actual computer languages execute.

Operational Equivalence

In this course we are taking a mathematical view of programs. What is a primary relation defined over a space of mathematical objects? Equivalence!

Examples. Equivalence is important!

Defining Operational Equivalence

We define equivalence in a manner dating all the way back to Leibniz:
Two programs are equivalent if and only if one can be replaced with the other at any place, and no external change in behavior will be noticed.

A more precise definition of equivalence

We define the notion of contexts C as follows: a context C is an expression with one or more holes, written *, in it. Examples of contexts and hole filling:


  (Function z -> (Function x -> *) z)
  (Function q -> e)(*)
Hole filling:
(Function z -> (Function x -> *) z)[x+2]
Means "put x+2 in the hole(s) in the (Function z .. ) term"; the result is
(Function z -> (Function x -> x+2) z)
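One lightweight way to model a (single-hole) context in Caml is as a function from the hole's contents to the filled expression; a sketch, using our own names c and filled:

```ocaml
type ident = Ident of string
type expr = Var of ident | Function of ident * expr | Appl of expr * expr
          | Plus of expr * expr | Int of int

(* the context (Function z -> (Function x -> *) z) as a Caml function *)
let c hole =
  Function (Ident "z", Appl (Function (Ident "x", hole), Var (Ident "z")))

(* filling the hole with x + 2 *)
let filled = c (Plus (Var (Ident "x"), Int 2))
```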

Operational equivalence is defined simply as follows:

Definition e =~ e' if and only if for all contexts C, C[e] ==> v for some v if and only if C[e'] ==> v' for some v'.

v and v' can be anything because a bigger context could always test them some more: the context
(Function x -> C'[x])(C[e])
would first compute to C'[v], and then v is tested by context C'. So, v and v' above are going to have to be quite similar, and in fact it is easy to show that they must be identical if they are not functions.

Example Equivalences

Some general equivalence principles for D programs are defined.

Here are some laws.

Equivalence transformations on programs can be used to justify results of computations instead of directly computing with the evaluator; it is often easier.

An important equation relating Y:

Y f x =~ f (Freeze(Y f)) x
An important component of compiler optimization is applying transformations such as the above that preserve equivalence.

Technical Issue: capture-avoiding substitution For example,
 (Function z -> (Function x -> y + x) z){x + 2/y} = (Function z -> (Function x1 -> x + 2 + x1) z)
Observe about this example

Proving Equivalences Hold

Last modified: Tue Apr 2 17:41:29 EST 2002