Pairing is the most fundamental form of data aggregation in programming. With pairing you can build just about anything you want.
For example, triples (x,y,z) can be encoded via nested pairs (x,(y,z)), etc.
Extend the expr type by adding
... | Pr of expr * expr | Left of expr | Right of expr
Extend the eval function by adding clauses
eval Pr(expr1,expr2) = Pr(eval(expr1),eval(expr2))
eval Left(expr) = match eval(expr) with
Pr(expr1,expr2) -> expr1
...
eval Right(expr) = ...
This is an "eager" pair: the components of the pair are evaluated.
Caml 2-tuples are eager: (2,3+4) evaluates to (2,7).
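The eager pair clauses above can be sketched as a small runnable Caml program. This is only a sketch under assumptions: Int and Plus stand in for the rest of D, and the Stuck exception is a hypothetical name for whatever the elided error cases do.

```ocaml
(* A tiny expression language with eager pairs. Int and Plus stand in
   for the rest of D (an assumption to keep the sketch small). *)
type expr =
  | Int of int
  | Plus of expr * expr
  | Pr of expr * expr
  | Left of expr
  | Right of expr

exception Stuck (* hypothetical name for a run-time type error *)

let rec eval e =
  match e with
  | Int n -> Int n
  | Plus (e1, e2) ->
      (match eval e1, eval e2 with
       | Int n1, Int n2 -> Int (n1 + n2)
       | _ -> raise Stuck)
  | Pr (e1, e2) ->
      (* eager: both components are evaluated, left one first *)
      let v1 = eval e1 in
      let v2 = eval e2 in
      Pr (v1, v2)
  | Left e1 ->
      (match eval e1 with
       | Pr (v1, _) -> v1
       | _ -> raise Stuck)
  | Right e1 ->
      (match eval e1 with
       | Pr (_, v2) -> v2
       | _ -> raise Stuck)

(* the pair (2, 3+4) from the text *)
let example = eval (Pr (Int 2, Plus (Int 3, Int 4)))
```

Here example is Pr (Int 2, Int 7), mirroring how the Caml tuple (2,3+4) evaluates to (2,7).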
Question: if we
wanted any (e,e') to be considered a value immediately,
how would our evaluator be written?
The space of values is now bigger: the values are now either values of D or pairs Pr(v,v') where v and v' are values.
Recall that 3-tuples can be encoded via two-tuples Pr(e,Pr(e',e'')),
and similarly for 4-tuples etc.
Operational semantics rules for tuples: an exercise.
What advantages do records have over tuples?
Records can be encoded with tuples: the record {x = 5; y = 7; z = 6} maps to the tuple (5,(7,6)), and the selectors .x, .y, and .z map to Function x -> Left(x), Function x -> Left(Right(x)), and Function x -> Right(Right(x)) respectively. Obviously this makes ugly, hard-to-read code, but it works. For C-style structs, this encoding would work.
But, in the case where records can grow or shrink, this encoding is fundamentally too weak. C++ structs can be subtypes of one another, so some fields that are not declared may in fact be present at run-time.
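For a record of fixed shape, the tuple encoding described above can be tried directly in Caml. In this sketch the built-in fst and snd play the role of Left and Right, and the names r, get_x, get_y, and get_z are made up for illustration.

```ocaml
(* The record {x = 5; y = 7; z = 6} encoded as the nested pair (5,(7,6)). *)
let r = (5, (7, 6))

(* Each field selector is a fixed path through the nested pair. *)
let get_x p = fst p             (* .x  ~  Left          *)
let get_y p = fst (snd p)       (* .y  ~  Left(Right)   *)
let get_z p = snd (snd p)       (* .z  ~  Right(Right)  *)
```

The selectors depend on the exact position of each field, which is why the encoding breaks down once records can have extra fields.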
Recall Caml records are of the form
{size = 7; weight = 245.3; name = "Buzz"}
They can have any number of fields. Values are selected by syntax
record.size. We will use the same syntax in our D language extension, which we will name DR.
Given the two records
{size = blah; weight = blah}
and
{weight = blah; name = blah}
either one can be passed to the function Function x -> x.weight, since x may be any record with a weight field.
type label = Lab of string
Records may be of arbitrary length, so a Caml list of (label, expr) pairs must be used to define record syntax. The DR expr type is
type expr = ... | Record of (label * expr) list | Select of expr * label
The concrete syntax
{size = 7; weight = 245}
is then encoded as abstract syntax within Caml as
Record [(Lab "size", (Int 7)); (Lab "weight", (Int 245))]
and e.size as Select(e,Lab"size").
The definition of values is extended from the values of D:
Records {field1 = v1; ..; fieldn = vn}
are values provided v1 through
vn are values.
Finally, we extend the D interpreter to a DR interpreter.
let rec eval e = match e with
    ...
  | Record(body) -> Record(evalRecord(body))
  | Select(exp,lab) -> (match eval(exp) with
        Record(fieldList) -> lookupRecord(fieldList,lab)
      ... )
    ...
(* evaluate every field of the record, keeping the labels *)
and evalRecord l = match l with
    [] -> []
  | (Lab l,exp)::xs -> (Lab l,eval(exp))::evalRecord(xs)
(* return the value of the first field whose label matches *)
and lookupRecord (record,Lab s) = match record with
    [] -> raise FieldNotFound
  | (Lab s1,v)::xs -> if s1 = s then v else
      lookupRecord(xs,Lab s)
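The record clauses above can be filled out into a small self-contained program. This is a sketch under assumptions: Int and Plus stand in for the rest of D, and the NotARecord and NotAnInt exceptions are hypothetical names for the elided error cases.

```ocaml
type label = Lab of string

type expr =
  | Int of int
  | Plus of expr * expr
  | Record of (label * expr) list
  | Select of expr * label

exception FieldNotFound
exception NotARecord (* hypothetical: selecting from a non-record *)
exception NotAnInt   (* hypothetical: adding non-integers *)

let rec eval e =
  match e with
  | Int n -> Int n
  | Plus (e1, e2) ->
      (match eval e1, eval e2 with
       | Int a, Int b -> Int (a + b)
       | _ -> raise NotAnInt)
  | Record body -> Record (evalRecord body)
  | Select (e1, lab) ->
      (match eval e1 with
       | Record fields -> lookupRecord (fields, lab)
       | _ -> raise NotARecord)

(* evaluate the fields in order, keeping the labels *)
and evalRecord l =
  match l with
  | [] -> []
  | (lab, e1) :: rest ->
      let v = eval e1 in
      (lab, v) :: evalRecord rest

(* return the value of the first field with a matching label *)
and lookupRecord (fields, Lab s) =
  match fields with
  | [] -> raise FieldNotFound
  | (Lab s1, v) :: rest ->
      if s1 = s then v else lookupRecord (rest, Lab s)

(* two records of different shapes that both have a weight field *)
let r1 = Record [(Lab "size", Int 7); (Lab "weight", Plus (Int 200, Int 45))]
let r2 = Record [(Lab "weight", Int 3); (Lab "age", Int 9)]
let weight_of r = eval (Select (r, Lab "weight"))
```

Both weight_of r1 and weight_of r2 succeed even though the records have different fields, which the tuple encoding could not support.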
Is {} or Record [], the empty record, OK?
We will study DS, a language obtained by adding
Ref e (reference creation), e := e' (set),
and !e (get) syntactic operations to D.
State is our first example of a side effect in programming: the effect of assigning is not local since distant parts of the program may have the same cell and thus see the change.
Other side effects:
An example of the nonlocal nature of side effects.
let x = ref 9 in
let f z = x := !x + z in
x:= 5; f(5); !x
;;
- : int = 10
Since side effects are not local, they can make programs a lot more
difficult to understand.
Programming Moral: Be sparing in your use of side effects.
Reference cell side effects:
The evaluation relation e ==> v won't work in the presence of memory.
Cell names c are an abstract form of memory location.
A store S is a finite map from cell names to values. We write Dom(S) to refer to the domain of the finite map, the set of all cells that it maps.
The DS expressions are those of D plus Ref e, e := e', !e, and cell names c. Cell names c are required: "Ref 5" returns a reference to a heap location, which has to be some kind of name for a spot in the heap.
We write
S { c |-> v }
to indicate the store S modified/extended so cell c maps to value v. S(c) is the value of cell c in store S.
The evaluation relation now has the form
< e,S0 > ==> < v,S >
where at the start of the computation S0 is an initial (empty) store and S is the final store when the computation terminates.
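The store notation can be made concrete with a purely functional finite map. The following is a minimal sketch, assuming cells are named by integers and stored values are integers; the names CellMap, modify, lookup, and dom are illustrative choices, not part of DS.

```ocaml
(* Finite maps keyed by integer cell names. *)
module CellMap = Map.Make (struct
  type t = int
  let compare = compare
end)

(* For this sketch a store maps cell names to integer values. *)
type store = int CellMap.t

let empty : store = CellMap.empty

(* S { c |-> v }: a new store, the old one is left untouched *)
let modify (s : store) c v : store = CellMap.add c v s

(* S(c): raises Not_found if c is not in Dom(S) *)
let lookup (s : store) c = CellMap.find c s

(* Dom(S): the cells the store maps, in increasing order *)
let dom (s : store) = List.map fst (CellMap.bindings s)

let s1 = modify empty 1 7
let s2 = modify s1 1 8
let s3 = modify s2 2 0
```

Because modify returns a new map, s1 still maps cell 1 to 7 after s2 is built; this is what lets the semantics talk about distinct stores S1, S2, and so on.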
In the process of evaluation, cells c will begin to
appear in the program syntax, as references to memory locations.
Cells are values since they do not need to be evaluated,
so the space of DS values also includes cells
c.
The store is threaded along the flow of control. This introduces more dependency between the rules, even the ones that don't directly manipulate the store. From the function application rule you should get an idea of the change needed to the other rules.
Function application rule
<e1, S1 > ==> <Function x -> e, S2 >, <e2, S2 > ==> <v2, S3 >, <e[v2/x], S3 > ==> <v, S4 >
------------------------------------------------------
<e1 e2, S1 > ==> <v, S4 >
Note how the store here is threaded through the different evaluations, showing how changes in the store in one place propagate to the store in other places, and in a fixed order that reflects the intended evaluation order.
Rules for the memory operations are as follows.
Ref e rule
<e, S1 > ==> <v, S2 >
------------------------------------------------------
<Ref e, S1 > ==> <c, S2 { c |-> v } >
for c not in Dom(S2), i.e. a new cell name
!e rule
<e, S1 > ==> <c, S2 >
------------------------------------------------------
<!e, S1 > ==> <v, S2 >
where S2(c) = v
e := e' rule
<e1, S1 > ==> <c, S2 >, <e2, S2 > ==> <v, S3 >
------------------------------------------------------
<e1 := e2, S1 > ==> <v, S3 { c |-> v } >
Here are some examples of execution with state to ponder. Note these work identically in Caml.
!(!(Ref Ref 5)) + 4
(Function y -> If !y = 0 Then y Else 0)(Ref 7)
Let x = Ref 4 In Let y = Ref 5 In (If !x = 0 Then
x Else y) := 6
Let y = Ref 0 In ((Function x -> y := x)(Ref 5)) := 6; !!y
We show <(Function y -> If !y = 0 Then y Else 0)(Ref 7),empty >
==> <0, {c |-> 7} >.
This matches the conclusion of the function application rule, provided
we show three things:
<(Function y -> If !y = 0 Then y Else 0),empty >
==> <(Function y -> If !y = 0 Then y Else 0),empty >
<Ref 7,empty >
==> <c, {c |-> 7} >
<(If !y = 0 Then y Else 0)[c/y], {c |-> 7} >
==> <0, {c |-> 7} >
The first holds because a function is already a value, and the second follows from the Ref rule above; let's work further on the third.
<If !c = 0 Then c Else 0, {c |-> 7} >
==> <0, {c |-> 7} >
because by the If rule,
<!c = 0, {c |-> 7} > ==> <False, {c |-> 7} >,
which follows in turn by the = rule because
<!c, {c |-> 7} > ==> <7, {c |-> 7} >.
The operational semantics clearly defines the meaning of DS programs, but we would also like to briefly consider how the interpreter may be implemented in Caml. Here is the abstract syntax.
type ident = Ident of string
type expr =
    Var of ident | Function of ident * expr | Appl of expr * expr |
    Letrec of ident * ident * expr |
    Plus of expr * expr | Minus of expr * expr | Equal of expr * expr |
    And of expr * expr | Or of expr * expr | Not of expr |
    If of expr * expr * expr | Int of int | Bool of bool |
    Ref of expr | Set of expr * expr | Get of expr | Cell of int
Cell values represent the cell names c from the operational semantics.
Cell names are numbered: Cell(1),Cell(2),Cell(3),Cell(4),...
Programs do not have Cells in them before they start executing, but as memory is allocated, Cell values will start appearing.
We have two choices in writing an interpreter for DS.
One choice is a functional interpreter that threads the store explicitly: eval(e,s) for expr e and state s will return (v,s'), the final value and final state.
(* declare all the expr, etc types globally (too hard to do it "right") *)
(* put the store functionality in a separate module. *)
module type STORE =
  sig
    (* ... *)
  end
(* the Store structure implements a (functional) store. A simple
   implementation could be via a list of pairs such as
   [((Cell 2),(Int 4)); ((Cell 3),Plus((Int 5),(Int 4))); ... ] *)
module Store : STORE =
  struct
    type store = (* ... *)
    let empty = (* initial empty store *)
    let fresh = (* a simple object which returns a fresh Cell name *)
      let count = ref 0 in
      function () -> ( count := !count + 1; Cell(!count) )
    (* note: this is not purely functional! it's difficult to make fresh
       purely functional *)
    (* look up value of cell c in store s *)
    let lookup (s,c) = (* ... *)
    (* add or modify cell c to value v in store s, returning new store *)
    let modify(s,c,v) = (* ... *)
  end
(* evaluator is then a functor taking a store module *)
module DSEvalFunctor =
  functor (Store : STORE) ->
    struct
      (* ... *)
      (* eval must be recursive: it calls itself on subexpressions *)
      let rec eval (e,s) = match e with
          (Int n) -> ((Int n),s) (* values don't modify store *)
        | Plus(e,e') ->
            let (Int n,s') = eval(e,s) in
            let (Int n',s'') = eval(e',s') in
            (Int (n+n'),s'')
        (* other cases such as application are a similar store threading *)
        | Ref(e) -> let (v,s') = eval(e,s) in
            let c = Store.fresh() in
            (c,Store.modify(s',c,v))
        (* look the cell up in s', the store after evaluating e *)
        | Get(e) -> let (Cell(n),s') = eval(e,s) in
            (Store.lookup(s',Cell(n)),s')
        | Set(e,e') -> (* exercise *)
    end
module DSEval = DSEvalFunctor(Store)
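To see the store threading run, here is a cut-down, self-contained sketch covering only Int, Plus, and the memory operations. It makes several assumptions that are not in the skeleton above: the store is an association list, fresh picks the next unused number by inspecting the store (which works here only because cells are never removed), Stuck is a hypothetical error exception, and the Set clause is just one possible reading of the e := e' rule.

```ocaml
type expr =
  | Int of int
  | Plus of expr * expr
  | Ref of expr
  | Set of expr * expr
  | Get of expr
  | Cell of int

(* store as an association list from cell numbers to values *)
type store = (int * expr) list

exception Stuck (* hypothetical name for a run-time type error *)

(* a cell number not in Dom(s): one more than the largest in use *)
let fresh (s : store) = 1 + List.fold_left (fun m (c, _) -> max m c) 0 s
let lookup (s, c) = List.assoc c s
let modify (s, c, v) = (c, v) :: List.remove_assoc c s

let rec eval (e, s) =
  match e with
  | Int n -> (Int n, s)
  | Cell c -> (Cell c, s)
  | Plus (e1, e2) ->
      (match eval (e1, s) with
       | (Int n1, s1) ->
           (match eval (e2, s1) with
            | (Int n2, s2) -> (Int (n1 + n2), s2)
            | _ -> raise Stuck)
       | _ -> raise Stuck)
  | Ref e1 ->
      let (v, s1) = eval (e1, s) in
      let c = fresh s1 in
      (Cell c, modify (s1, c, v))
  | Get e1 ->
      (match eval (e1, s) with
       | (Cell c, s1) -> (lookup (s1, c), s1)
       | _ -> raise Stuck)
  | Set (e1, e2) ->
      (match eval (e1, s) with
       | (Cell c, s1) ->
           let (v, s2) = eval (e2, s1) in
           (v, modify (s2, c, v))
       | _ -> raise Stuck)

(* !(!(Ref Ref 5)) + 4, started in the empty store *)
let (v, final) = eval (Plus (Get (Get (Ref (Ref (Int 5)))), Int 4), [])
```

Here v is Int 9, and the final store holds two cells: cell 1 containing Int 5 and cell 2 containing Cell 1.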
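The other choice is an imperative interpreter in which the code for the functional features (Plus, etc) is totally ignorant of the store, and the code for the memory operations (Ref/Set/Get) imperatively extends/updates the store.
With side effects, sequencing (";", while and for-loops) thus becomes relevant (Question: why was it previously irrelevant??). These syntactic concepts are easily defined as macros, so we do not add them as official syntax: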
e1 ; e2 = (Function x -> e2)(e1), for a variable x that does not occur free in e2
While e Do e' = (Let Rec f x = If e Then f(e') Else e)(0)
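As an illustration of treating sequencing as a macro, the expansion of e1 ; e2 can be written as an ordinary Caml function that builds abstract syntax. This is a sketch only: the cut-down expr type, the function name seq, and the variable name "_seq" (assumed not to occur free in e2) are all hypothetical choices.

```ocaml
type ident = Ident of string

type expr =
  | Var of ident
  | Int of int
  | Function of ident * expr
  | Appl of expr * expr

(* e1 ; e2  =  (Function x -> e2)(e1)
   The bound variable "_seq" is a made-up name that is assumed
   not to occur free in e2. *)
let seq (e1, e2) = Appl (Function (Ident "_seq", e2), e1)

let expanded = seq (Int 1, Int 2)
```

Since the argument e1 is evaluated before the function body, the application runs e1 first, discards its value, and then runs e2.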
...
Let x = Ref 0 In x := x
This is the simplest store cycle, a cell that points directly to itself.
With this x, what does !!!!!!!!!!! x return?
Question: Can such a form of a cycle be written in Caml?
A more subtle form of cycle is when a function is placed in a cell, and the body of that function refers to the cell.
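This kind of cycle can be tried in Caml itself. The sketch below assumes one change forced by Caml's typing: the cell has to start out holding a dummy function rather than a number, so that its type is already a function type.

```ocaml
(* Start with a dummy function so the cell has type (int -> int) ref. *)
let c = ref (fun (_ : int) -> 0)

(* Overwrite the cell with a function whose body reads the cell again. *)
let () = c := (fun x -> if x = 0 then 0 else 1 + !c (x - 1))

(* Each call goes back through the cell to find the function. *)
let result = !c 10
```

The function is never declared recursive; the recursion happens only through the store, and result is 10.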
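Let c = Ref 0 In c := (Function x -> If x = 0 Then 0 Else 1 + !c(x-1)); !c(10)
-- cell c contains a function which refers to the cell, and thus the function.
Caml and DS use an explicit !x operator to get the value in a cell.
Most other imperative languages instead write x = x + 1.
There, the x on the left of the assignment is an l-value, and the x in x + 1 is an r-value.
In Caml and DS the same assignment is written x := !x + 1.
This makes explicit whether the cell (x for x a ref type) or its value (!x) is being referred to.
For the l-value in x = (l = left of the assignment), the cell x is needed to perform the store, and for the r-value in x + 1, the value in the cell is needed, so !x + 1 is written.
Some expressions can be l-values (x, x[3] (array)), and others can only be r-values (5, 0 == 1, sin(4.3)).
In DS any expression that evaluates to a cell may appear on the left, as in f(3) := 7, where f is a function returning a cell.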
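The distinction can be seen in a few lines of Caml. In this sketch, cells and f are hypothetical names: cells is an array of separately allocated reference cells, and f simply returns one of them.

```ocaml
let x = ref 0

(* On the left, x names the cell; on the right, !x reads its contents. *)
let () = x := !x + 1

(* Five distinct cells, each initially holding 0. *)
let cells = Array.init 5 (fun _ -> ref 0)

(* f is a function returning a cell. *)
let f i = cells.(i)

(* The left side of := can be any expression that evaluates to a cell. *)
let () = f 3 := 7
```

After this runs, x holds 1 and only the cell returned by f 3 has changed to 7.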
type expr = ... | Ref of expr | Set of var * expr | Get of expr
For the variable on the left hand side of an assignment, we need the address and not the contents of the variable.
free()
eval-uation
We are briefly going to touch on some efficiency issues in our interpreters as we have been defining them.
Goal: get rid of explicit substitutions. A "low level" interpreter would never copy the function argument to each position in the function body. To compute
(Function x -> x x x)(whopping expr)
the expression
(whopping expr)(whopping expr)(whopping expr)
is computed, tripling the size of the data.
We will study translations for DSR. Missing language features that we will study later (and not consider when studying translations) include objects and classes, exceptions, and types.
Last modified: Thu Apr 18 12:28:10 EDT 2002