(2+3)*(3-4) has children
2+3 and 3-4 because of how
* is computed.
e ==> v means a program e computed
to a final result v (a value).
(node A): (Function x -> x + 2) (3 + 2 + 5) ==> 12 because
(node B, child of A): 3 + 2 + 5 ==> 10, because
(node C, child of B): 3 + 2 ==> 5, and
(node B, again): 5 + 5 ==> 10; and then,
(node E, child of A): 10 + 2 ==> 12.
... In general, to compute a function application,
e ==> v, where e is a
program in the language.
e ==> v is mathematically a 2-place relation between expressions
of the language, e, and values of the language,
v.
5, and functions Function x -> ....
are also values since they don't compute to anything.
e and v are
metavariables, meaning they denote an arbitrary expression or
value, and should not be confused with the (regular) variables that
are part of programs. In the beginning I will underline metavariables
to underscore the difference, but eventually we will drop this
convention for brevity.
Definition. The boolean logic expressions e
consist of values True and False, and
expressions e And e, e Or e, Not e, and
e Implies e.
type boolexp =
    True
  | False
  | Not of boolexp
  | And of boolexp * boolexp
  | Or of boolexp * boolexp
  | Implies of boolexp * boolexp
(Note: We are going to use Capitalized keywords in all of our little language syntax to avoid potential conflicts with e.g. Caml code.)
==> satisfying the
following rules.
True rule
----------------------------
True ==> True

False rule
----------------------------
False ==> False

Not rule
e ==> v
-----------------------------
Not e ==> the negation of v

And rule
e1 ==> v1, e2 ==> v2
--------------------------------------
e1 And e2 ==> the logical and of v1 and v2

Or, Implies rules: should be clear from above.

Comments on the rules
Showing e ==> v
amounts to constructing a sequence of rule applications in
which the final rule application logically concludes with
e ==> v.
This computation is a tree because there are two subcomputations necessary for each binary operator.
Not(Not(False)) And True ==> False, because by the And rule,
True ==> True, and
Not(Not(False)) ==> False, the latter because
Not(False) ==> True, because
False ==> False.
==> is the "least" relation satisfying the rules?
Lemma. The boolean language is
deterministic: if e ==> v and e ==>
v', then v = v'.
Proof. By induction on the height of the proof tree.
QED.
Lemma. The boolean language is
normalizing: For all boolean expressions e,
there is some value v where e ==> v.
Proof. By induction on the size of e.
QED.
Question: Suppose we left off the True rule by mistake; what nice property would fail?
Given an operational semantics defined via relation
==>, there is a corresponding (Caml) evaluator function
eval.
Note, the Caml function eval takes the program
e as argument in the form of its syntax tree, for instance
Plus(Int(1),Times(Int(2),Int(3))).
Definition. A (Caml) interpreter function
eval faithfully implements an
operational semantics e ==> v if
e ==> v if and only if eval(e) returns result v.
eval as follows.
let rec eval exp =
  match exp with
    True -> True
  | False -> False
  | Not(exp0) -> (match eval exp0 with
                    True -> False
                  | False -> True)
  | And(exp0,exp1) -> (match (eval exp0, eval exp1) with
                         (True,True) -> True
                       | (_,False) -> False
                       | (False,_) -> False)
  | Or(exp0,exp1) -> (match (eval exp0, eval exp1) with
                        (False,False) -> False
                      | (_,True) -> True
                      | (True,_) -> True)
  | Implies(exp0,exp1) -> (match (eval exp0, eval exp1) with
                             (False,_) -> True
                           | (True,True) -> True
                           | (True,False) -> False)
The only difference between the operational semantics and the evaluator is
that the evaluator is a function, so we start with the bottom-left
expression in a rule, use the evaluator to recursively produce the
value(s) above the line in the rule, and finally compute and return
the value below the line in the rule.
e ==>
v if and only if eval(e) returns v
as result.
We will go back and forth between these two forms during the course. The operational semantics form is used because it is independent of any particular programming language. The evaluator form is good because you can test your evaluator on real code.
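As a small illustration of testing the evaluator on real code, here is a self-contained sketch that replays the earlier derivation of Not(Not(False)) And True. The helper names to_bool and of_bool are assumptions introduced here for brevity; they are not part of the notes, and this is only one possible way to write the evaluator.

```ocaml
(* Syntax of the boolean language, as in the notes. *)
type boolexp =
  | True
  | False
  | Not of boolexp
  | And of boolexp * boolexp
  | Or of boolexp * boolexp
  | Implies of boolexp * boolexp

(* Hypothetical helpers: convert between language values and Caml booleans. *)
let to_bool v =
  match v with
  | True -> true
  | False -> false
  | _ -> invalid_arg "to_bool: not a value"

let of_bool b = if b then True else False

(* One case per rule: the premises above the line become recursive calls,
   and the conclusion below the line becomes the returned value. *)
let rec eval exp =
  match exp with
  | True | False -> exp
  | Not e -> of_bool (not (to_bool (eval e)))
  | And (e1, e2) ->
    let v1 = to_bool (eval e1) in
    let v2 = to_bool (eval e2) in
    of_bool (v1 && v2)
  | Or (e1, e2) ->
    let v1 = to_bool (eval e1) in
    let v2 = to_bool (eval e2) in
    of_bool (v1 || v2)
  | Implies (e1, e2) ->
    let v1 = to_bool (eval e1) in
    let v2 = to_bool (eval e2) in
    of_bool ((not v1) || v2)

(* The example from the proof tree: Not(Not(False)) And True *)
let example = eval (And (Not (Not False), True))
```

Here example is False, matching the derivation by hand.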
Question: Why not just use interpreters and forget
about the operational semantics approach?
Answer: Then the whole exercise is circular, since we
don't really know what the Caml compiler is doing. Operational
semantics provides a foundation free of any particular language.
Definition. A metacircular interpreter is an interpreter for (possibly a subset of) language X that is written in language X. Metacircular interpreters give you some idea of how a language works, but suffer from the above non-foundational problems. A metacircular interpreter for Lisp is a classic programming language theory exercise.
D is a "Diminutive" pure functional programming language.
(5 3).
lambda x.e instead of Function x
-> e, where lambda is the Greek lowercase
character. It is the original higher-order functional language, and
dates from the 1930s (!). More later on this.
Definition. The expressions, e, of the
D language are inductively defined as the least set
including
x,
Function x -> e and function
application e e,
Let Rec f x = e
0, 1, -1, 2, -2, ... and numerical
operations + - = ,
True, False and boolean operations
And, Or, Not,
If e Then e Else e.
0, 1, -1, ...,
True and False,
Function x -> e
Let Rec f x = e.
e meaning an
arbitrary D expression, v meaning an
arbitrary expression that is a value, and x meaning an expression
which is a variable.
type ident = Ident of string
type expr =
    Var of ident | Function of ident * expr | Appl of expr * expr
  | Letrec of ident * ident * expr
  | Plus of expr * expr | Minus of expr * expr | Equal of expr * expr
  | And of expr * expr | Or of expr * expr | Not of expr
  | If of expr * expr * expr | Int of int | Bool of bool
expr is the type of D
expressions within Caml
ident is needed because function parameters must
be variables, they can't be any other expression. This is one way to
force this.
(Function x -> x + 2) (3 + 2 + 5)
expr type:
Appl(Function(Ident "x",Plus(Var(Ident "x"),Int 2)),Plus(Plus(Int 3,Int 2),Int 5)) is
the abstract syntax corresponding to the above concrete syntax.
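To make the concrete-to-abstract correspondence tangible, the following sketch builds that syntax tree as a Caml value. It assumes + associates to the left, which is what the derivation of 3 + 2 + 5 in these notes suggests (3 + 2 is computed first); the name example is introduced here only for illustration.

```ocaml
type ident = Ident of string

type expr =
  | Var of ident
  | Function of ident * expr
  | Appl of expr * expr
  | Letrec of ident * ident * expr
  | Plus of expr * expr
  | Minus of expr * expr
  | Equal of expr * expr
  | And of expr * expr
  | Or of expr * expr
  | Not of expr
  | If of expr * expr * expr
  | Int of int
  | Bool of bool

(* (Function x -> x + 2) (3 + 2 + 5), with 3 + 2 + 5 read as (3 + 2) + 5 *)
let example =
  Appl
    ( Function (Ident "x", Plus (Var (Ident "x"), Int 2)),
      Plus (Plus (Int 3, Int 2), Int 5) )
```

Note that every numeral has to be wrapped in Int and every variable use in Var(Ident ...), since a bare Caml 2 or "x" is not of type expr.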
(Function x -> x + 2) (3 + 2 + 5) ==> 12 because
3 + 2 + 5 ==> 10, because
3 + 2 ==> 5, and
5 + 5 ==> 10; and then,
10 + 2 ==> 12.
Note how in this example, the argument is substituted for the
variable in the body -- this gives us a rewriting interpreter.
(Function x -> x + 1) 2 will compute by
substituting 2 for x in the function's
body x + 1,
i.e. by computing 2 + 1.
(Function x -> Function x -> x)(3) should not evaluate to
(Function x -> 3) since the inner x is bound by
the inner parameter.
Definition
x
occurs in e if
x appears somewhere in e. Note we refer to
variable uses only, not definitions.
x in Function
x -> e are
bound; any free occurrences of x in e
here are bound occurrences in Function
x -> e. Similarly, occurrences of f and
x are bound in Let Rec f x = e.
x occurs free in e if it
has an occurrence in e which is not a bound occurrence.
e is closed if it contains no free
variable occurrences. All programs we execute are closed (no
link-time errors).
e[e'/x] is notation for the expression resulting
from the operation of replacing all free
occurrences of x in e with e'. For now,
we assume that e' is a closed expression.
Examples.
x occurs free in 3 + x, and
occurs both bound and free in expression x (Function x -> x).
(Function x -> x) (Function x -> x) is
closed.
(x (Function x -> x))[(Function x -> x+1)/x] is
(Function x -> x+1) (Function x -> x).
(y + y)[3/y] is 3 + 3
(Function y -> y + y)[3/y] is (Function y
-> y + y) since there are no free y to
substitute for.
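The definitions and examples above can be sketched in Caml as follows. This is a minimal sketch, assuming (as the notes do for now) that the substituted expression e' is closed; the function names occurs_free and subst are chosen here for illustration and are not taken from D-examples.ml.

```ocaml
type ident = Ident of string

type expr =
  | Var of ident
  | Function of ident * expr
  | Appl of expr * expr
  | Letrec of ident * ident * expr
  | Plus of expr * expr
  | Minus of expr * expr
  | Equal of expr * expr
  | And of expr * expr
  | Or of expr * expr
  | Not of expr
  | If of expr * expr * expr
  | Int of int
  | Bool of bool

(* occurs_free x e: does x have an occurrence in e that is not bound? *)
let rec occurs_free x e =
  match e with
  | Var y -> x = y
  | Int _ | Bool _ -> false
  | Function (y, body) -> x <> y && occurs_free x body
  | Letrec (f, y, body) -> x <> f && x <> y && occurs_free x body
  | Not e1 -> occurs_free x e1
  | Appl (e1, e2) | Plus (e1, e2) | Minus (e1, e2)
  | Equal (e1, e2) | And (e1, e2) | Or (e1, e2) ->
    occurs_free x e1 || occurs_free x e2
  | If (e1, e2, e3) ->
    occurs_free x e1 || occurs_free x e2 || occurs_free x e3

(* subst e e' x computes e[e'/x]; e' is assumed closed, so no capture. *)
let rec subst e e' x =
  let s e0 = subst e0 e' x in
  match e with
  | Var y -> if y = x then e' else e
  | Int _ | Bool _ -> e
  (* a binder for x shields its body: no free x below it *)
  | Function (y, body) -> if y = x then e else Function (y, s body)
  | Letrec (f, y, body) ->
    if f = x || y = x then e else Letrec (f, y, s body)
  | Appl (e1, e2) -> Appl (s e1, s e2)
  | Plus (e1, e2) -> Plus (s e1, s e2)
  | Minus (e1, e2) -> Minus (s e1, s e2)
  | Equal (e1, e2) -> Equal (s e1, s e2)
  | And (e1, e2) -> And (s e1, s e2)
  | Or (e1, e2) -> Or (s e1, s e2)
  | Not e1 -> Not (s e1)
  | If (e1, e2, e3) -> If (s e1, s e2, s e3)
```

For instance, subst (Plus (Var (Ident "y"), Var (Ident "y"))) (Int 3) (Ident "y") yields Plus (Int 3, Int 3), while the same substitution applied to Function y -> y + y leaves it unchanged.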
If 3 = 4 Then 5 Else 4 + 2 ==> 6 because
3 = 4 ==> False and
4 + 2 ==> 6, because
4 ==> 4 and
2 ==> 2 and 4 plus 2 is 6.
(Function x -> If 3 = x Then 5 Else x + 2) 4 ==> 6, because of
above derivation
(Function x -> x x)(Function y -> y) ==> Function y -> y, because
(Function y -> y)(Function y -> y) ==> Function y -> y
(Function f -> Function x -> f(f(x)))(Function x -> x - 1)(4) ==> 2 because
letting F abbreviate (Function x -> x - 1),
(Function x -> F(F(x)))(4) ==> 2, because
F(F(4)) ==> 2, because
F(4) ==> 3, because
4 - 1 ==> 3. And then,
F(3) ==> 2, because
3 - 1 ==> 2.
(Function x -> Function y -> x+y)
((Function x -> If 3 = x Then 5 Else x + 2) 4)
((Function f -> Function x -> f(f(x)))(Function x -> x - 1)(4)) ==> 8 by the above two executions
(Let Rec f x = If x = 0 Then 0 Else x + f (x - 1))(1) ==> 1 because
letting F abbreviate (Let Rec f x = If x = 0 Then 0 Else x + f (x - 1)),
If 1 = 0 Then 0 Else 1 + F (1 - 1) ==> 1, because
1 = 0 ==> False, and
1 + F (1 - 1) ==> 1, because
F (1 - 1) ==> 0, because
1 - 1 ==> 0, and
If 0 = 0 Then 0 Else 0 + F (0 - 1) ==> 0, because
0 = 0 ==> True, and
0 ==> 0
Lemma. D is not normalizing: there
is some e such that there is no v with
e ==> v.
Proof. (Function x -> x x)(Function x -> x
x) is not normalizing. Neither is 4 3.
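The example executions above can be replayed mechanically. The following is a rough sketch of a substitution-based, call-by-value evaluator, not the evaluator from D-examples.ml. The exception Stuck and the helpers arith and logic are assumptions made here; stuck programs such as 4 3 raise Stuck, while (Function x -> x x)(Function x -> x x) would simply make eval loop.

```ocaml
type ident = Ident of string

type expr =
  | Var of ident | Function of ident * expr | Appl of expr * expr
  | Letrec of ident * ident * expr
  | Plus of expr * expr | Minus of expr * expr | Equal of expr * expr
  | And of expr * expr | Or of expr * expr | Not of expr
  | If of expr * expr * expr | Int of int | Bool of bool

(* e[e'/x], assuming e' is closed *)
let rec subst e e' x =
  let s e0 = subst e0 e' x in
  match e with
  | Var y -> if y = x then e' else e
  | Int _ | Bool _ -> e
  | Function (y, b) -> if y = x then e else Function (y, s b)
  | Letrec (f, y, b) -> if f = x || y = x then e else Letrec (f, y, s b)
  | Appl (a, b) -> Appl (s a, s b)
  | Plus (a, b) -> Plus (s a, s b)
  | Minus (a, b) -> Minus (s a, s b)
  | Equal (a, b) -> Equal (s a, s b)
  | And (a, b) -> And (s a, s b)
  | Or (a, b) -> Or (s a, s b)
  | Not a -> Not (s a)
  | If (a, b, c) -> If (s a, s b, s c)

(* Raised when no rule applies (hypothetical name). *)
exception Stuck of string

let rec eval e =
  match e with
  | Int _ | Bool _ | Function _ | Letrec _ -> e   (* values *)
  | Var (Ident x) -> raise (Stuck ("free variable " ^ x))
  | Plus (a, b) -> arith ( + ) a b
  | Minus (a, b) -> arith ( - ) a b
  | Equal (a, b) ->
    (match eval a, eval b with
     | Int m, Int n -> Bool (m = n)
     | _ -> raise (Stuck "= expects integers"))
  | And (a, b) -> logic ( && ) a b
  | Or (a, b) -> logic ( || ) a b
  | Not a ->
    (match eval a with
     | Bool p -> Bool (not p)
     | _ -> raise (Stuck "Not expects a boolean"))
  | If (c, t, f) ->
    (match eval c with
     | Bool true -> eval t
     | Bool false -> eval f
     | _ -> raise (Stuck "If expects a boolean"))
  | Appl (e1, e2) ->
    (match eval e1 with
     | Function (x, body) -> eval (subst body (eval e2) x)
     | Letrec (f, x, body) as fn ->
       (* substitute the argument for x and the function itself for f *)
       eval (subst (subst body (eval e2) x) fn f)
     | _ -> raise (Stuck "application of a non-function"))

and arith op a b =
  match eval a, eval b with
  | Int m, Int n -> Int (op m n)
  | _ -> raise (Stuck "arithmetic expects integers")

and logic op a b =
  match eval a, eval b with
  | Bool p, Bool q -> Bool (op p q)
  | _ -> raise (Stuck "boolean operator expects booleans")
```

With this sketch, the syntax tree for (Function x -> x + 2) (3 + 2 + 5) evaluates to Int 12, and the tree for 4 3 raises Stuck.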
eval function
which takes D programs and produces D values as result, following the
above operational semantics. File D-examples.ml contains the Caml type
for D syntax as well as some sample executions. That file
will be reviewed
in class, and contains concrete D code for most of
the examples in the remainder of these notes.
let
with Caml.
D-examples.ml file.
(* First some abbreviations to save finger wear *)
let i s = Ident s (* abbreviation for identifiers *)
let v s = Var(Ident s) (* abbreviation for variables *)
(* super shorthand for common ident/var names *)
let ix = i"x" (* ident x *)
let vx = v"x" (* variable x *)
let iy = i"y" (* ident y *)
let vy = v"y" (* variable y *)
let iz = i"z" (* ident z *)
let vz = v"z" (* variable z *)
let il = i"l" (* ident l *)
let vl = v"l" (* variable l *)
let ir = i"r" (* ident r *)
let vr = v"r" (* variable r *)
(* The classic pure functional combinators *)
let id = Function(ix,vx) (* I x = x *)
let k = Function(ix,Function(iy,vx)) (* K x y = x *)
let s = Function(ix,Function(iy,Function(iz, (* S x y z = (x z) (y z) *)
Appl(Appl(vx,vz),(Appl(vy,vz))))))
let d = Function(ix, Appl(vx,vx)) (* D x = x x *)
Macros:
Function x -> ... "
kind of thing.
id
above) with D variables: id above is a macro name, not a D variable.
let, recursive definitions via let rec, and
structures such as lists or other datatypes are all encodable in
D, by applying a trick or two.
(1,
(2,3)), ... etc for n-tuples.
(* Pairs may be encoded as functions (not entirely adequate however) *)
let pr (l,r) = (* make a pair with left element l and right element r *)
  Appl(Appl(Function(il,Function(ir,Function(ix,Appl(Appl(vx,vl),vr)))),l),r)
let prexample = pr(Int 4,pr(Int 5,Bool true))
(* projections left and right *)
let left e = Appl(e,Function(ix,Function(iy,vx)))
let right e = Appl(e,Function(ix,Function(iy,vy)))
Test: try
pr(4,5). That is
(Function l -> Function r -> Function x -> x l r) 4 5
which computes by computing
(Function x -> x 4 5)
which is a value and we are done. Now let's try
left(pr(4,5)). We have pr(4,5)'s
value from above; continuing,
(Function p -> p (Function x -> Function y -> x))(Function x -> x 4 5)
computes by computing
(Function x -> x 4 5) (Function x -> Function y -> x)
which computes by computing
(Function x -> Function y -> x) 4 5
which computes by computing
4
Voila!
left (Function x -> 0) ==> 0
but a function shouldn't be a pair! There should have been a run-time error here.
right(pr(3,pr(4,5))); it will evaluate to the
value pr(4,5) one
might at first think, but it will really return (Function x -> x 4
5). We can
only guess that this is intended to be a pair.
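The same encoding can be tried directly with Caml's own functions rather than D syntax trees. This is only an analogy under that assumption: here pr, left and right are ordinary curried Caml functions, not the macros from D-examples.ml.

```ocaml
(* A pair is a function waiting for a selector to hand its two parts to. *)
let pr l r = fun select -> select l r

(* The projections pass in a selector that keeps one of the two arguments. *)
let left p = p (fun x _ -> x)
let right p = p (fun _ y -> y)

let four = left (pr 4 5)                    (* first component *)
let five = right (pr 4 5)                   (* second component *)
let nested = left (right (pr 3 (pr 4 5)))   (* left of the inner pair *)

(* The inadequacy noted above: any function can pose as a pair. *)
let bogus = left (fun _ -> 0)
```

Here four is 4, five is 5, nested is 4, and bogus is 0 even though fun _ -> 0 was never built by pr.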
[1;2;3] is represented by pr(1, pr(2,
pr(3, emptylist)))
emptylist is some agreed-on empty list, 0 for us.
(* Pairs may be encoded as functions (not entirely adequate however) *)
let pr (l,r) = (* make a pair with left element l and right element r *)
Appl(Appl(Function(il,Function(ir,Function(ix,Appl(Appl(vx,vl),vr)))),l),r)
let prexample = pr(Int 4,pr(Int 5,Bool true))
(* projections left and right *)
let left e = Appl(e,Function(ix,Function(iy,vx)))
let right e = Appl(e,Function(ix,Function(iy,vy)))
(* Lists may be encoded as a pair consisting of the head and tail *)
let head = left
let tail = right
let emptylist = (Int 0) (* something for empty list *)
let cons = pr
let length = Letrec(i"Length",ix,
  If(Equal(vx,emptylist),(Int 0),Plus(Appl(v"Length",tail(vx)),
  (Int 1))))
let aList = cons(Plus(Int 1,Int 1),cons(Plus(Int 1,Int 1),cons(Int 3,emptylist)))
Function and
application to make a Turing-complete programming language. This
language is known as the pure lambda calculus, and functions are
usually written as lambda x.e instead of Function x ->
e.
Let is definable:
Let x = e in e' is defined as (Function x -> e') e
An example:
Let x = 3 + 2 in x + x End is
(Function x -> x + x)(3 + 2), which evaluates to 10.
Sequencing.
Notice there is no sequencing (;) operation. Why not? Answer: if
e;e' is what you want to sequence, you might as well just
write e', as
e will never get used. This changes if Print or mutable state is added
(operators with side effects).
Sequencing is definable, nonetheless:
e ; e' is defined as (Function newvar -> e') e, where newvar is chosen so as not to be free in e'
This will first execute
e, throw away the value, and then
execute e', returning its result as the final result of e;e'.
Freezing and thawing
We can stop and re-start computation at will by freezing and thawing.
Freeze e is defined as Function newvar -> e
Thaw e is defined as e(0)
(newvar should be a fresh variable, so it's not free in
e; the 0 above could be any value)
Freeze(e) freezes e, keeping it from being
computed. Thaw(e) starts up a frozen computation.
Let x = Freeze(2+3) in Thaw(x) + Thaw(x)
--this has the same value as without
Freeze/Thaw, but
2+3 is evaluated twice.
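The double evaluation can be observed in Caml with a counter. In this sketch Freeze is written inline as a function abstraction, since a Caml function named freeze would evaluate its argument before freezing it; the counter evaluations and the helper two_plus_three are assumptions added purely for instrumentation.

```ocaml
(* Counts how many times the frozen expression is actually computed. *)
let evaluations = ref 0

(* Stands for the expression 2+3, instrumented with the counter. *)
let two_plus_three () =
  incr evaluations;
  2 + 3

(* Thaw e is e(0): apply the frozen computation to a dummy argument. *)
let thaw frozen = frozen 0

let result =
  (* Freeze(2+3) is Function newvar -> 2+3 *)
  let x = fun _newvar -> two_plus_three () in
  thaw x + thaw x
```

After this runs, result is 10 and evaluations holds 2: nothing is computed when x is bound, and each Thaw recomputes 2+3.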
Q: How can programs compute forever in D without Let Rec?
A: Easy: (Function x -> x x)(Function x -> x x).
Corollary: D is not normalizing.
x x is a function being
applied to itself!
Function x -> x < 2 is the set of all numbers less than 2.
e member-of S iff
S(e) is true. (Function x -> x < 2)(1) is true, 1 is
in this "set".
Now consider P = "the set of all sets that do not contain
themselves as members"!:
P = Function x -> Not(x x)
(Note, it may make sense to have a set with itself as member: the set {{{{...}}}}, infinitely receding, has itself as a member; this only happens in so-called non-well-founded set theory).
Now, is P P? Namely is P a
member of itself? This is written:
(Function x -> Not(x x)) (Function x -> Not(x x))
--if this were viewed as a D program, it would loop forever: it suffices to compute
Not((Function x -> Not(x x))(Function x -> Not(x x)))
Now, notice we have
P is a member of itself if and only
if it isn't, a contradiction!
(function x -> not(x x)) (function x ->
not(x x)) is not typeable in Caml for the same reason the
predicate is
not typeable in Russell's ramified theory of types.
# function x -> not (x x);;
^
This expression has type 'a -> 'b but is here used with type 'a
Let Rec.
summate0 = Function this -> Function arg -> If arg = 0 Then 0 Else arg + this(this)(arg-1) + 1
Then we can write a function call as
summate0(summate0)(7) (* summates numbers 0 .. 7 *)
summate0 always expects its first argument
this to be itself
this) and pass another copy for
future duplication.
summate0(summate0) primes the pump
by giving it an initial extra copy of itself.
let summate =
Let summ = (Function this -> Function arg ->
If arg = 0 Then 0 Else arg + this(this)(arg-1) + 1)
In
Function arg -> summ(summ)(arg)
and invoke as
summate 7 (* summates numbers 0..7 *)
so we don't have to let the world see the self-passing business.
summ
can be abstracted to be some abstract body passed in
itself as a higher-order function.
almosty = Function body ->
Let fun = (Function this -> Function arg ->
body(this)(arg))
In
Function arg -> (fun fun)(arg)
-- the body of summ above contains arg and
this, so the abstract body body gets those
things passed to it.
almosty can be used by defining summate as
summate = almosty (Function this -> Function arg ->
If arg = 0 Then 0 Else arg + this(this)(arg-1) + 1)
The Y-combinator actually goes one more step and passes
this(this) as argument, not just this,
simplifying what we pass to Y:
y = Function body ->
Let fun = (Function this -> Function arg ->
body(this this)(arg))
In
Function arg -> (fun fun)(arg)
This combinator can then be used to define summate as
summate = y (Function thisthis -> Function arg ->
If arg = 0 Then 0 Else arg + thisthis(arg-1) + 1)
-- the parameter thisthis is exactly used for a recursive call.
The above is almost the Y combinator given in the D-examples.ml file; the major difference is
that version has fun inlined (repeated twice) instead of being defined
via Let.
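Because Caml rejects self-application x x (as shown earlier), this combinator cannot be transcribed literally. One workaround, sketched here under that assumption, is to wrap the self-passed function in a recursive datatype; the type self and its constructor Self are inventions of this sketch, and the body passed in is a plain summation of 0..arg.

```ocaml
(* A function that takes (a wrapped copy of) itself as first argument. *)
type ('a, 'b) self = Self of (('a, 'b) self -> 'a -> 'b)

(* y body: like the y above, with "this this" spelled "this me",
   where me is the wrapped copy of fn. No "let rec" is used. *)
let y body =
  let fn (Self this as me) arg = body (this me) arg in
  fun arg -> fn (Self fn) arg

(* thisthis is exactly the recursive call. *)
let summate =
  y (fun thisthis arg -> if arg = 0 then 0 else arg + thisthis (arg - 1))

let total = summate 7
```

Here total is 28, the sum of 0..7. The partial application this me does not loop because fn takes two arguments, so applying it to one just builds a closure.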
==> for D by replacing the Function application rule with the following rule.
call-by-name Function application rule
e1 ==> Function x -> e, e[e2/x] ==> v
------------------------------------------------------
e1 e2 ==> v
And, similarly, a new rule for
Let Rec is needed.
Freezing and Thawing, defined above, is a way to get call-by-name behavior in a call-by-value language.
(Function x -> Thaw(x) + Thaw(x))Freeze(3-2)--
3-2 is not evaluated until we are inside the body of the function
where it is thawed, and it is then evaluated two separate times. This
is precisely the behavior of call-by-name parameter passing, so Freeze
and Thaw can encode it by this means. The
fact that 3-2 is executed twice shows the main weakness of call by
name: repeated evaluation of the function argument.
It is called the lambda-calculus because functions are written
lambda x.e (using the Greek lambda character) instead of
Function x -> e.
Fact: Numbers, booleans, and conditional can be encoded in the pure lambda-calculus.
Execution in the pure lambda calculus
(Function x -> e) e' ==>
e[e'/x] is the (only) execution rule,
called beta reduction
=~ (written on the board as
= with a ~ above it) defined for all
D programs.
(eta-conversion) (Function x -> e) =~ (Function z -> (Function x -> e) z) for z not free in e.
This equivalence is similar to the proxy pattern in object-oriented programming.
Freeze/Thaw syntax is
Thaw(Freeze(e)) =~ e
One of these programs may be replaced by the other without ill effects (besides perhaps changing execution time), so they are equivalent.
Two programs are equivalent if and only if one can be replaced with the other at any place, and no external change in behavior will be noticed.
x
+ 1 - 1 =~ x.
e1 =~ e2,
e1 in the * position
and run the program;
e2.
C as follows.
*
punched in it: replace some subterm(s) of any expression with
*.
C[e] means:
place e in the holes * in C
Contexts:
(Function z -> (Function x -> *) z)
(Function q -> e)(*)
Hole filling:
(Function z -> (Function x -> *) z)[x+2]
means "put
x+2 in the hole(s) in the (Function z
.. ) term"; the result is
(Function z -> (Function x -> x+2) z)
e may have free variables in it
which become bound under substitution ; this is known as
capture.
x in x+2 is captured in the above
example.
Definition e =~ e' if and only if for
all contexts C, C[e] ==>
v for some v if and only if
C[e'] ==> v' for some
v'.
v and v', they could in
theory be different.
(Function x -> C'[x])(C[e]) would first compute to
C'[v], and then v is
tested by context C'. So, v and
v' above are going to have to be quite similar, and in
fact it is easy to show that they must be identical if they are not
functions.
Here are some laws.
reflexivity e =~ e, symmetry e =~ e'
if e' =~ e, transitivity e =~ e''
if e =~ e' and e' =~ e''
C[e] =~ C[e'] if e =~ e' (congruence)
(Function x -> e)(v) =~ e{v/x} (this is
beta-equivalence; e{v/x} is capture-avoiding
substitution, defined below)
(Function x -> e) =~ (Function z -> (Function x -> e) z) (eta)
(Function x -> e) =~ (Function y -> e{y/x}) (alpha)
n + n' =~ the sum of numbers n and
n', and similar laws for -, And, Or,
Not, =;
If True Then e Else e' =~ e, and similar for If False...
If e ==> v then e
=~ v (evaluation)
An important equation relating Y:
Y f x =~ f (Freeze(Y f)) x
An important component of compiler optimization is applying transformations such as the above that preserve equivalence.
e{e'/x} to
deal with capture.
e{e'/x} is a generalized form of
substitution that differs from our previously defined substitution
operation e[e'/x] in that e' does not have
to be closed.
x with
e', but avoid capture from occurring.
This is implemented by renaming any capturing variable bindings in
e.
(Function z -> (Function x -> y + x) z){x + 2/y} = (Function z -> (Function x1 -> x + 2 + x1) z)
Observe about this example
x + 2 would be captured if we just stuffed
x + 2 in for y, a bad thing.
Function x -> (Function y -> Function z -> (Function x -> y + x) z)(x + 2)
if we ignored capture the beta rule would give us
Function x -> (Function z -> (Function x -> (x + 2) + x) z)
which is clearly not equivalent to the above program.
x to a fresh
variable not occurring in e or e', x1 in
this case.
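The renaming strategy just described can be sketched in Caml. To keep it short, this sketch assumes a cut-down expression type with string variable names rather than the full expr type; the helper names names, free_vars and fresh are hypothetical, and fresh picks the first of x1, x2, ... that occurs nowhere in the expressions involved.

```ocaml
(* Simplified syntax, for illustration only. *)
type expr =
  | Var of string
  | Int of int
  | Plus of expr * expr
  | Function of string * expr
  | Appl of expr * expr

(* Every variable name appearing in e, including parameters. *)
let rec names e =
  match e with
  | Var x -> [x]
  | Int _ -> []
  | Plus (a, b) | Appl (a, b) -> names a @ names b
  | Function (x, body) -> x :: names body

(* Names with at least one free occurrence in e. *)
let rec free_vars e =
  match e with
  | Var x -> [x]
  | Int _ -> []
  | Plus (a, b) | Appl (a, b) -> free_vars a @ free_vars b
  | Function (x, body) -> List.filter (fun y -> y <> x) (free_vars body)

(* First of x1, x2, ... that is not in the avoid list. *)
let fresh x avoid =
  let rec try_from n =
    let candidate = x ^ string_of_int n in
    if List.mem candidate avoid then try_from (n + 1) else candidate
  in
  try_from 1

(* subst e e' x computes e{e'/x}; e' may contain free variables. *)
let rec subst e e' x =
  match e with
  | Var y -> if y = x then e' else e
  | Int _ -> e
  | Plus (a, b) -> Plus (subst a e' x, subst b e' x)
  | Appl (a, b) -> Appl (subst a e' x, subst b e' x)
  | Function (y, _) when y = x -> e   (* x is rebound: nothing free below *)
  | Function (y, body) when List.mem y (free_vars e') ->
    (* y would capture a free variable of e': rename the binder first *)
    let y' = fresh y (names body @ names e' @ [x]) in
    let renamed = subst body (Var y') y in
    Function (y', subst renamed e' x)
  | Function (y, body) -> Function (y, subst body e' x)
```

Applied to the example above, substituting x + 2 for y in Function z -> (Function x -> y + x) z renames the inner x to x1 and yields Function z -> (Function x1 -> (x + 2) + x1) z.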
1 + 1 =~ 2 is hard to prove.
Last modified: Tue Apr 2 17:41:29 EST 2002