Well maybe I got a little carried away there...
Explicit control operations are operations that explicitly alter the control of the evaluation process. Even in the simplest of languages, there are control operators:
  return e
In D, the value of the function is whatever its whole body evaluates to. If in the middle of some complex conditional loop expression we have the final result of the computation, it is still necessary to complete the execution of the function. A return statement gets around this problem by immediately returning from the function and aborting the rest of the function computation.
Another common class of control operator is the loop exit,
a.k.a. break. break is very similar to
return.
We are interested in studying this type of control operator, but are really more interested in more powerful control operators, the principal ones being exceptions and exception handlers. You are familiar with exceptions from the Caml exception mechanism.
Then there are the truly bizarre control operators: call/cc, shift/reset, control/prompt, ...
Then there is the nonsensical: goto. This operator is
just too raw. What does it mean to goto a label in a
function? Maybe that function isn't even executing. You can skip
past variable initializations. Java has no goto
(although it is a reserved word). Thesis: if you have enough other good
control operators, goto is not needed.
Conclusion: although control operators are not required, they sure
make programming more convenient. They can be considered a
"meta-operator", something that is "acting on" the evaluation process.
We add Return e to D first, since it is the simplest control operator.
It doesn't fit into the normal evaluation scheme. An example is
  (Function x -> (If x = 0 Then 5 Else Return (4 + x)) - 8) 4
Since x will not be zero, we have to avoid executing the code "- 8". But evaluating the above means evaluating
  (If 4 = 0 Then 5 Else Return (4 + 4)) - 8
which means evaluating
  (Return (4 + 4)) - 8
And the value of this should be the value of the LHS expression minus 8, according to the evaluation rule for minus. That won't work!
Add to the space of D language values, values of the
form Return v.
To evaluate Return, use the rule
Return rule
  e --> v
  -----------------------------
  Return e --> Return v
Then, add new rules for subtraction on top of the current rule,
- Return left rule
  e --> Return v
  -----------------------------
  e - e' --> Return v
- Return right rule
  e --> v, e' --> Return v'
  -----------------------------
  e - e' --> Return v'
This "bubbles up" the Return through the subtraction operator. Similar rules need to be given for every evaluation position of every D operator. So, using these new rules, the value of
  (Return (4 + 4)) - 8
is
  Return 8
So, the function should then return 8. We need a new function application rule which stops bubbling up function application results of the form Return v, and just returns v.
Appl Return rule
  e1 --> Function x -> e, e2 --> v2, e [v2/ x ] --> Return v
  ------------------------------------------------------
  e1 e2 --> v
A few other rules are needed for APPL, for the cases that the function or argument itself returns.
Appl Return Function
  e1 --> Return v
  ------------------------------------------------------
  e1 e2 --> Return v
Appl Return Arg
  e1 --> v1, e2 --> Return v
  ------------------------------------------------------
  e1 e2 --> Return v
Note we still keep around the old function evaluation rule for the case that the function returned implicitly by dropping off the end of its execution, returning a value v which is not Return v'.
There are two possible interpretations of Return Return v: one is to return from two levels of function call (the interpretation the above rules give), or we can add the rule
Return Return rule
  e --> Return v
  -----------------------------
  Return e --> Return v
and restrict the previous Return rule to the case that v is not of the form Return v'.
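The Return rules above can be sketched as a tiny evaluator in C. This is only an illustration under several assumptions: the Expr and Result types, the IFZERO and BODY node kinds, and run_example are hypothetical names, not part of D. BODY stands in for a function application whose argument has already been substituted, and the single is_return flag collapses nested Returns, which corresponds to the Return Return rule reading rather than the two-level one.

```c
#include <stddef.h>

/* Hypothetical mini-AST: just enough of D to show Return bubbling. */
typedef enum { NUM, MINUS, IFZERO, RETURN, BODY } Kind;

typedef struct Expr {
    Kind kind;
    int num;                        /* used by NUM only */
    const struct Expr *a, *b, *c;   /* sub-expressions */
} Expr;

/* A result is either a plain value v or a bubbling "Return v". */
typedef struct { int is_return; int value; } Result;

Result eval(const Expr *e) {
    Result l, r = {0, 0};
    switch (e->kind) {
    case NUM:
        r.value = e->num;
        break;
    case RETURN:                    /* Return rule: e --> v gives Return v */
        r = eval(e->a);
        r.is_return = 1;
        break;
    case MINUS:
        l = eval(e->a);
        if (l.is_return) return l;  /* Return left rule */
        r = eval(e->b);
        if (r.is_return) return r;  /* Return right rule */
        r.value = l.value - r.value;
        break;
    case IFZERO:                    /* If a = 0 Then b Else c */
        l = eval(e->a);
        if (l.is_return) return l;  /* bubble out of the condition */
        r = eval(l.value == 0 ? e->b : e->c);
        break;
    case BODY:                      /* Appl Return rule: stop the bubbling */
        r = eval(e->a);
        r.is_return = 0;
        break;
    }
    return r;
}

/* (Function x -> (If x = 0 Then 5 Else Return (4 + x)) - 8) applied to x. */
int run_example(int x) {
    Expr xv    = {NUM, x, NULL, NULL, NULL};
    Expr five  = {NUM, 5, NULL, NULL, NULL};
    Expr eight = {NUM, 8, NULL, NULL, NULL};
    Expr sum   = {NUM, 4 + x, NULL, NULL, NULL};  /* 4 + x, computed up front */
    Expr ret   = {RETURN, 0, &sum, NULL, NULL};
    Expr cond  = {IFZERO, 0, &xv, &five, &ret};
    Expr minus = {MINUS, 0, &cond, &eight, NULL};
    Expr body  = {BODY, 0, &minus, NULL, NULL};
    return eval(&body).value;
}
```

Under these assumptions run_example(4) yields 8, skipping the "- 8", while run_example(0) takes the ordinary path and yields 5 - 8.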
The treatment of Return above can be extended to deal with general exceptions. We define a language DX which is D extended with a Caml-style exception mechanism.
Caml doesn't even have return; however, its effect is easily simulated using exceptions.
  (function x -> (if x = 0 then 5 else return (4 + x)) - 8) 4
is encoded as
  exception Return of int;;
  (function x ->
    try
      (if x = 0 then 5 else raise (Return (4 + x))) - 8
    with
      Return n -> n)
  4;;
From this example you can get the idea of how any function can have return encoded.
Now: an interpreter for Caml exceptions.
The basic idea is
not very different from Return above: make a new kind of value
Raise xn(v) to bubble up exception named xn with value v.
This is the generalization of values Return v above.
e Try xn(x) -> e' then handles exception named xn if it
arises in e. Note that Try
binds free x's in e'.
(Function x ->
Let Exception return In
(If x = 0 Then 5 Else Raise return (4 + x)) - 8
Try return(n) -> n
) 4
Exceptions are side effects like references: they can cause
action at a distance. Like all side effects, they should be used
sparingly in programs.
datatype term = ... (* D stuff *) ...
  | Raise of term * term
  | Try of term * term * ide * term
  | LetExn of ide * term
  | Exn of string
Exn("Return") names
a particular exception. Since DX is untyped, there
is no need to declare exceptions.
DX values now also include values of the form
Raise xn(v). We will use metavariable xn to
refer abstractly to an exception name Exn(s).
The rules for Raise and Try are derived from the Return rules:
- Add Raise xn(v) to the space of D language values, in analogy to Return v values.
- Raise xn(v) values are "bubbled up" to escape their context, as were Return v values.
- Try points are where the bubbling-up stops; Return stopped bubbling up at function application points.
  e --> Exn xn, e' --> v, for v not of the form "Raise ..."
  -----------------------------
  Raise e(e') --> Raise xn(v)

  e --> v for v not of the form Raise xn(v)
  -----------------------------
  Try e With xn(x) -> e' --> v

  e --> Raise xn(v), e'[v/x] --> v'
  -----------------------------
  Try e With xn(x) -> e' --> v'

... plus rules to bubble Raise xn(v) values up as Return v were bubbled up, including
  e --> Raise xn(v)
  -----------------------------
  e - e' --> Raise xn(v)
Bubbling a Raise through a Raise also requires rules (this case will hardly ever arise in practice):
  e --> Raise xn(v)
  -----------------------------
  Raise e(e') --> Raise xn(v)

  e --> v, e' --> Raise xn(v')
  -----------------------------
  Raise e(e') --> Raise xn(v')
Reconsider the above example.
  (Function x -> Try (If x = 0 Then 5 Else Raise return(4 + x)) - 8 With return(n) -> n) 4
Evaluating this means evaluating
  Try (Raise return(4 + 4)) - 8
  With return(n) -> n
which is
  Try (Raise return(8)) - 8
  With return(n) -> n
And by bubbling, it suffices to compute
  Try Raise return(8)
  With return(n) -> n
which, by the Try rule for a raised exception, returns the value
  8
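The Raise and Try rules can be sketched in C in the same bubbling style. This is a minimal illustration, not the DX interpreter: exception names are assumed to be plain integers, VAR is a hypothetical node standing for the single variable x bound by the nearest Try, and Expr, Result, eval and run_try are names invented for the example.

```c
#include <stddef.h>

typedef enum { NUM, VAR, MINUS, RAISE, TRY } Kind;

typedef struct Expr {
    Kind kind;
    int num;                    /* NUM: the literal; RAISE and TRY: the name xn */
    const struct Expr *a, *b;   /* TRY: a is the body e, b is the handler e' */
} Expr;

/* Either a plain value v, or a bubbling "Raise xn(v)". */
typedef struct { int raised; int xn; int value; } Result;

/* x is the value bound to the handler variable (the e'[v/x] substitution). */
Result eval(const Expr *e, int x) {
    Result l, r = {0, 0, 0};
    switch (e->kind) {
    case NUM:
        r.value = e->num;
        break;
    case VAR:
        r.value = x;
        break;
    case MINUS:
        l = eval(e->a, x);
        if (l.raised) return l;     /* bubble up from the left operand */
        r = eval(e->b, x);
        if (r.raised) return r;     /* bubble up from the right operand */
        r.value = l.value - r.value;
        break;
    case RAISE:
        r = eval(e->a, x);
        if (r.raised) return r;     /* a Raise bubbling through a Raise */
        r.raised = 1;
        r.xn = e->num;
        break;
    case TRY:
        r = eval(e->a, x);
        if (r.raised && r.xn == e->num)
            r = eval(e->b, r.value);    /* matching name: run the handler */
        /* otherwise a plain value or a non-matching Raise passes through */
        break;
    }
    return r;
}

/* Try (Raise 1(8)) - 8 With handler_xn(n) -> n, the raised exception being 1. */
Result run_try(int handler_xn) {
    Expr eight = {NUM, 8, NULL, NULL};
    Expr raise = {RAISE, 1, &eight, NULL};
    Expr minus = {MINUS, 0, &raise, &eight};
    Expr n     = {VAR, 0, NULL, NULL};
    Expr tr    = {TRY, handler_xn, &minus, &n};
    return eval(&tr, 0);
}
```

With these assumptions run_try(1) handles the exception and produces the plain value 8, while run_try(2) does not match and leaves the Raise bubbling with its name and payload intact.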
Exceptions as presented above are almost identical to the Caml form.
The "bubbling up" method of interpretation is correct, but in practice it is very inefficient. For instance, operations such as - will always have to check if either argument is of the form Raise ... and act appropriately, greatly slowing down code.
Better interpreters can be written that get around this problem, but for lack of time we address the problem in the context of compilers only.
We sketch how the DSR compiler could be extended to a
DSRX compiler. Exceptions are still "bubbled up",
but we can take much bigger steps, immediately popping back to the
nearest enclosing Try.
The closure conversion, A-translation, and function hoisting algorithms can be extended to exception constructs, with new tuple forms
  x = [ let tuples ] Try xn (y) -> [ let tuples ]
  x = Raise y (* y a variable since body of Raise was A-translated *)
  x = Exception
resulting.
The toC function can then be extended as follows.
Note, int exception_count and jmp_buf
nearest_Try are global C variables.
The <setjmp.h> C library is used to return from a deeply nested function call via setjmp/longjmp. This library is a primitive nonlocal control operation.
Semantics of setjmp/longjmp: executing setjmp(aTry) is something like the fork library function in C, in that the one call can return more than once:
- setjmp(aTry) sets the current statement as a return point to pop the stack to later.
- setjmp(aTry) returns 0 at this initial execution, when the return point is set.
- longjmp(aTry,arg) pops activation records (function calls) off the stack and returns to the point where setjmp(aTry) was executed. Execution resumes at that setjmp instruction, returning arg (hopefully non-0; C substitutes 1 if arg is 0) as the result of the setjmp this time. The code should always branch on the result of setjmp(aTry) to determine if the break point is being set or returned to.
- aTry is a jmp_buf, a buffer containing data relevant to the setjmp/longjmp; setjmp mutates it (storing the return point, etc.) and longjmp reads it back.
- The compiled code keeps a global variable nearest_Try which at the current point of execution contains the nearest enclosing Try statement.
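The setjmp/longjmp behaviour described above can be tried out in a few lines of C. This is a small sketch, assuming the hypothetical helper names deep and demo and an arbitrary jump argument of 7; only setjmp, longjmp and jmp_buf come from <setjmp.h>.

```c
#include <setjmp.h>

static jmp_buf aTry;            /* holds the saved return point */

/* Recurse a few levels, then jump straight back to the setjmp in demo. */
static void deep(int depth) {
    if (depth == 0)
        longjmp(aTry, 7);       /* pops every activation of deep at once */
    deep(depth - 1);
}

int demo(void) {
    switch (setjmp(aTry)) {     /* always branch on the result of setjmp */
    case 0:                     /* the return point was just set */
        deep(3);
        return -1;              /* not reached: deep never returns normally */
    case 7:                     /* we came back via longjmp(aTry, 7) */
        return 7;
    default:                    /* some other longjmp argument */
        return -2;
    }
}
```

Calling demo() returns 7: the first pass through setjmp yields 0, and the longjmp four calls deeper makes the same setjmp yield 7. The switch keeps setjmp as the whole controlling expression, which is one of the contexts the C standard explicitly permits.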
toCTuple(x = Let Exception xn In e) =
  xn = exception_count++;
  toCTuples(e)
toCTuple(Raise xn(y)) =
  longjmp(nearest_Try,"[xn,y]") /* immediately pop off function activations,
                                   returning to nearest enclosing setjmp, i.e. Try */
toCTuple(x = [ let tuples 1 ] Try xn (y) -> [ let tuples 2 ]) =
  { jmp_buf next_nearest_Try;
    Word val;
    /* save the enclosing Try; a jmp_buf is an array type, so it is copied rather than assigned */
    memcpy(next_nearest_Try, nearest_Try, sizeof(jmp_buf));
    /* setjmp call below marks this point in code as a future possible return point */
    /* nearest_Try is set to be this location in the code by the setjmp call */
    /* val is 0 when the code is normally executed, but if it is returned to, it is an argument */
    if ((val = setjmp(nearest_Try)) == 0)
    { /* 0 return indicates jump point was just set */
      toCTuples([let tuples 1]);
      memcpy(nearest_Try, next_nearest_Try, sizeof(jmp_buf)); /* pop the stack of Trys */
    }
    else
    { /* non-0 indicates a longjmp brought us back to this set point */
      memcpy(nearest_Try, next_nearest_Try, sizeof(jmp_buf)); /* pop the Try stack */
      "[xn_raised,y]" = val; /* the setjmp value above, if not zero, is the value passed to longjmp */
      if (xn == xn_raised) /* exception raised matches this Try */
      { toCTuples([let tuples 2]) }
      else /* exception is not handled here; try next handler further up the stack */
        longjmp(nearest_Try,val);
    }
  }
The setjmp/longjmp code is unnatural syntax, but
it gives enough of a primitive nonlocal control effect to allow
ML-style exceptions to be implemented.
This implementation only requires "bubbling" through all enclosing Try statements that do not match; other bubbling steps are skipped.
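What the generated code does at run time can be sketched by hand in C. This is not the output of toC, only an illustration under assumptions: TryFrame, raise_exn, function_body, try_example and nested_example are hypothetical names; exception names are plain ints; the name and payload travel in two globals because longjmp can carry only one int; and the Try stack is a linked list of frames rather than a copied jmp_buf. The sketch also assumes every raise has some enclosing Try.

```c
#include <setjmp.h>
#include <stddef.h>

/* One frame per active Try; frames live on the C stack and are linked. */
typedef struct TryFrame {
    jmp_buf env;
    struct TryFrame *outer;
} TryFrame;

static TryFrame *nearest_Try = NULL;   /* nearest enclosing Try */
static int raised_xn;                  /* name of the exception in flight */
static int raised_val;                 /* its payload */

/* Raise xn(val): pop straight back to the nearest enclosing Try. */
static void raise_exn(int xn, int val) {
    raised_xn = xn;
    raised_val = val;
    longjmp(nearest_Try->env, 1);
}

/* (If x = 0 Then 5 Else Raise xn(4 + x)) - 8 */
static int function_body(int x, int xn) {
    if (x == 0) return 5 - 8;
    raise_exn(xn, 4 + x);
    return 0;                          /* not reached */
}

/* Try function_body(x, raised) With handled(n) -> n */
int try_example(int x, int raised, int handled) {
    TryFrame frame;
    int result;
    frame.outer = nearest_Try;         /* push this Try */
    nearest_Try = &frame;
    if (setjmp(frame.env) == 0) {      /* jump point was just set */
        result = function_body(x, raised);
        nearest_Try = frame.outer;     /* pop the stack of Trys */
        return result;
    }
    nearest_Try = frame.outer;         /* back via longjmp: pop the Try stack */
    if (raised_xn == handled)
        return raised_val;             /* the handler body is just n */
    raise_exn(raised_xn, raised_val);  /* no match: try the next Try up */
    return 0;                          /* not reached */
}

/* An outer Try for exception 1 around an inner Try that only handles 2. */
int nested_example(void) {
    TryFrame frame;
    int r;
    frame.outer = nearest_Try;
    nearest_Try = &frame;
    if (setjmp(frame.env) == 0) {
        r = try_example(4, 1, 2);      /* raises 1; the inner Try re-raises */
        nearest_Try = frame.outer;
        return r;
    }
    nearest_Try = frame.outer;
    return raised_xn == 1 ? raised_val + 100 : -1;
}
```

Here try_example(4, 1, 1) catches the exception directly, while nested_example shows the only bubbling that remains: the inner Try does not match, so it pops itself and jumps again to the outer one.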
Question: What will the above compiler do if an uncaught exception is raised? How could a better solution to this be devised?