Exceptions and other Control Operators

Up to now there has been an inexorable march through the evaluation process. To evaluate this, that had to be done, and in turn that, and so on, each step returning a value which in turn gets analyzed. It's time to break the chains of tyrannical evaluation and free computation to do as it chooses!!!

Well maybe I got a little carried away there...

Explicit control operations are operations that explicitly alter the control of the evaluation process. Even in the simplest of languages, there are control operators:

return e
In D, the value of the function is whatever its whole body evaluates to. If in the middle of some complex conditional loop expression we have the final result of the computation, it is still necessary to complete the execution of the function. A return statement gets around this problem by immediately returning from the function and aborting the rest of the function computation.

Another common class of control operator is the loop exit, a.k.a. break. break is very similar to return.

We are interested in studying this type of control operator, but are really more interested in more powerful control operators, chief among them exceptions and exception handlers. You are familiar with exceptions from the Caml exception mechanism.
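As a reminder of how a Caml exception can act like a loop exit, here is a small sketch. The names Found and find_first are made up for this illustration; the exception abandons the traversal as soon as a result is known, much as break or return would.

```ocaml
(* Hypothetical example: Found carries the element that ends the search. *)
exception Found of int

(* Return the first element satisfying p, abandoning the traversal early. *)
let find_first p xs =
  try
    List.iter (fun x -> if p x then raise (Found x)) xs;
    None                      (* reached only if nothing was raised *)
  with Found x -> Some x      (* the handler plays the role of the loop exit *)
```

For instance, find_first (fun x -> x > 2) [1; 2; 3; 4] stops at 3 and never looks at 4.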

Then there are the truly bizarre control operators: call/cc, shift/reset, control/prompt, ...

Then there is the nonsensical: goto. This operator is just too raw. What does it mean to goto a label in a function? Maybe that function isn't even executing. You can skip past variable initializations. Java has no goto (although it is a reserved word). Thesis: if you have enough other good control operators, goto is not needed.

Conclusion: although control operators are not required, they sure make programming more convenient. They can be considered "meta-operators", something that is "acting on" the evaluation process.

Interpreting Control Operators: Return

How would you imagine exceptions are executed?
Let us consider adding Return e to D first, since it is the simplest control operator. It doesn't fit into the normal evaluation scheme. An example is
(Function x -> (If x = 0 Then 5 Else Return (4 + x)) - 8) 4
Since x will not be zero, we have to avoid executing the code "- 8". But evaluating the above means evaluating
(If 4 = 0 Then 5 Else Return (4 + 4)) - 8
which means evaluating
(Return (4 + 4)) - 8
And the value of this should be the value of the LHS expression minus 8, according to the evaluation rule for minus. That won't work!
Instead, we need to allow an exception to the normal rule at this point in the evaluation of minus.

Add to the space of D language values, values of the form Return v.
To evaluate Return, use the rule

Return rule
e --> v
-----------------------------
Return e --> Return v

Then, add new rules for subtraction on top of the current rule,
- Return left rule
e --> Return v
-----------------------------
e - e' --> Return v

- Return right rule
e --> v, e' --> Return v'
-----------------------------
e - e' --> Return v'

This "bubbles up" the Return through the subtraction operator. Similar rules need to be given for every evaluation position of every D operator.

So, using these new rules, the value of

(Return (4 + 4)) - 8
is
Return 8
So, the function should then return 8. We need a new function application rule which stops bubbling up function application results of the form Return v, and just returns v.

Appl Return rule
e1 --> Function x -> e, e2 --> v2, e [v2/ x ] --> Return v
------------------------------------------------------
e1 e2 --> v

A few other rules are needed for APPL, for the cases that the function or argument itself returns.
Appl Return Function
e1 --> Return v
------------------------------------------------------
e1 e2 --> Return v

Appl Return Arg
e1 --> v1, e2 --> Return v
------------------------------------------------------
e1 e2 --> Return v

Note we still keep around the old function evaluation rule for the case that the function returned implicitly by dropping off the end of its execution, returning a value v which is not of the form Return v'.

There are two possible interpretations of Return Return v: one is to return from two levels of function call (the interpretation the above rules give); alternatively, we can add the rule
Return Return rule
e --> Return v
-----------------------------
Return e --> Return v

and restrict the previous Return rule to the case that v is not of the form Return v'.
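The Return rules can be turned into a small interpreter. What follows is only a sketch under some assumptions: it covers a fragment of D just large enough for the running example, values are represented as closed expressions, and the names expr, subst, eval, binop and program are chosen for this illustration rather than taken from the course code. It implements the first interpretation of Return Return v (no Return Return rule).

```ocaml
type expr =
  | Int of int | Bool of bool | Var of string
  | Plus of expr * expr | Minus of expr * expr | Equal of expr * expr
  | If of expr * expr * expr
  | Function of string * expr | Appl of expr * expr
  | Return of expr              (* Return v, with v a value, is itself a value *)

let to_int = function Int n -> n | _ -> failwith "expected an integer"

(* subst e v x computes e[v/x]; v is assumed closed, so no capture can occur. *)
let rec subst e v x = match e with
  | Int _ | Bool _ -> e
  | Var y -> if y = x then v else e
  | Plus (a, b) -> Plus (subst a v x, subst b v x)
  | Minus (a, b) -> Minus (subst a v x, subst b v x)
  | Equal (a, b) -> Equal (subst a v x, subst b v x)
  | If (c, t, f) -> If (subst c v x, subst t v x, subst f v x)
  | Function (y, body) -> if y = x then e else Function (y, subst body v x)
  | Appl (a, b) -> Appl (subst a v x, subst b v x)
  | Return a -> Return (subst a v x)

let rec eval e = match e with
  | Int _ | Bool _ | Function _ -> e
  | Var x -> failwith ("unbound variable " ^ x)
  | Return e1 -> Return (eval e1)                          (* Return rule *)
  | Plus (a, b) -> binop a b (fun v1 v2 -> Int (to_int v1 + to_int v2))
  | Minus (a, b) -> binop a b (fun v1 v2 -> Int (to_int v1 - to_int v2))
  | Equal (a, b) -> binop a b (fun v1 v2 -> Bool (to_int v1 = to_int v2))
  | If (c, t, f) ->
    (match eval c with
     | Return _ as r -> r                                  (* bubble through If *)
     | Bool true -> eval t
     | Bool false -> eval f
     | _ -> failwith "expected a boolean")
  | Appl (e1, e2) ->
    (match eval e1 with
     | Return _ as r -> r                                  (* Appl Return Function *)
     | Function (x, body) ->
       (match eval e2 with
        | Return _ as r -> r                               (* Appl Return Arg *)
        | v2 ->
          (match eval (subst body v2 x) with
           | Return v -> v                                 (* Appl Return rule *)
           | v -> v))                                      (* old Appl rule *)
     | _ -> failwith "expected a function")

(* Shared shape of the binary operators: left rule, right rule, normal rule. *)
and binop a b f =
  match eval a with
  | Return _ as r -> r                                     (* Return left rule *)
  | v1 ->
    (match eval b with
     | Return _ as r -> r                                  (* Return right rule *)
     | v2 -> f v1 v2)

(* (Function x -> (If x = 0 Then 5 Else Return (4 + x)) - 8) applied to arg *)
let program arg =
  Appl (Function ("x",
          Minus (If (Equal (Var "x", Int 0), Int 5,
                     Return (Plus (Int 4, Var "x"))),
                 Int 8)),
        Int arg)
```

With this sketch, eval (program 4) gives Int 8, since the Return skips the "- 8", while eval (program 0) takes the Then branch and gives Int (-3).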

Interpreting Exceptions

The translation of Return above can be extended to deal with general exceptions. We define a language DX which is D extended with a Caml-style exception mechanism. Caml doesn't even have return; however, its effect is easily simulated using exceptions.
(function x -> (if x = 0 then 5 else return (4 + x)) - 8) 4
is encoded as
exception Return of int;;

(function x ->
  try
    (if x = 0 then 5 else raise (Return (4 + x))) - 8
  with
    Return n -> n)
  4;;
From this example you can get the idea of how any function can have return encoded.

Now: an interpreter for Caml exceptions.
The basic idea is not very different from Return above: make a new kind of value, Raise xn(v), which bubbles up the exception named xn carrying the value v. This generalizes the values Return v above. The expression e Try xn(x) -> e' (written Try e With xn(x) -> e' in the rules below) then handles the exception named xn if it arises in e. Note that Try binds free occurrences of x in e'.

(Function x ->
  Let Exception return In
    (If x = 0 Then 5 Else Raise return (4 + x)) - 8
    Try return(n) -> n
 ) 4
Exceptions are side effects like references: they can cause action at a distance. Like all side effects, they should be used sparingly in programs.

Implementing the DX Interpreter

The DX term datatype is
datatype term = ... (* D stuff *) ...
| Raise of term * term | Try of term * term * ide * term 
| LetExn of ide * term | Exn of string
Exn("Return") names a particular exception. Since DX is untyped, there is no need to declare exceptions. DX values now also include values of the form Raise xn(v). We will use metavariable xn to refer abstractly to an exception name Exn(s).

The rules for Raise and Try are derived from the Return rules:

e --> xn, e' --> v, for v not of the form "Raise ..."
-----------------------------
Raise e(e') --> Raise xn(v)

e --> v, for v not of the form Raise xn(v')
-----------------------------
Try e With xn(x) -> e' --> v

e --> Raise xn(v), e'[v/x] --> v'
-----------------------------
Try e With xn(x) -> e' --> v'

... plus rules to bubble Raise xn(v) values up just as Return v values were bubbled up, including
e --> Raise xn(v)
-----------------------------
e - e' --> Raise xn(v)

Bubbling a Raise through a Raise also requires rules (this case will hardly ever arise in practice):
e --> Raise xn(v)
-----------------------------
Raise e(e') --> Raise xn(v)

e --> xn', e' --> Raise xn(v)
-----------------------------
Raise e(e') --> Raise xn(v)

Reconsider the above example.
(Function x ->
  Try (If x = 0 Then 5 Else Raise return(4 + x)) - 8
  With return(n) -> n) 4
Substituting 4 for x and taking the Else branch (since 4 = 0 is false), the value of this is the value of
Try (Raise return(4 + 4)) - 8
    With return(n) -> n
which is the value of
Try (Raise return(8)) - 8
    With return(n) -> n
And by bubbling, it suffices to compute
Try Raise return(8)
    With return(n) -> n
which, by the second Try rule with n bound to 8, returns value
8
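The Raise and Try rules can likewise be sketched as an interpreter. As before, this is an illustration under assumptions, not the course implementation: only a fragment of D is covered, values are closed expressions, Try takes a literal exception name in its second position (mirroring Try of term * term * ide * term), and the names subst, eval, binop and program are made up here.

```ocaml
type expr =
  | Int of int | Bool of bool | Var of string
  | Plus of expr * expr | Minus of expr * expr | Equal of expr * expr
  | If of expr * expr * expr
  | Function of string * expr | Appl of expr * expr
  | Exn of string                        (* an exception name xn *)
  | Raise of expr * expr                 (* Raise xn(v) is a value when v is *)
  | Try of expr * expr * string * expr   (* Try e With xn(x) -> e' *)

let to_int = function Int n -> n | _ -> failwith "expected an integer"

(* subst e v x computes e[v/x]; v is assumed closed. Try binds x in its handler. *)
let rec subst e v x = match e with
  | Int _ | Bool _ | Exn _ -> e
  | Var y -> if y = x then v else e
  | Plus (a, b) -> Plus (subst a v x, subst b v x)
  | Minus (a, b) -> Minus (subst a v x, subst b v x)
  | Equal (a, b) -> Equal (subst a v x, subst b v x)
  | If (c, t, f) -> If (subst c v x, subst t v x, subst f v x)
  | Function (y, body) -> if y = x then e else Function (y, subst body v x)
  | Appl (a, b) -> Appl (subst a v x, subst b v x)
  | Raise (a, b) -> Raise (subst a v x, subst b v x)
  | Try (a, xn, y, h) ->
    Try (subst a v x, subst xn v x, y, if y = x then h else subst h v x)

let rec eval e = match e with
  | Int _ | Bool _ | Function _ | Exn _ -> e
  | Var x -> failwith ("unbound variable " ^ x)
  | Plus (a, b) -> binop a b (fun v1 v2 -> Int (to_int v1 + to_int v2))
  | Minus (a, b) -> binop a b (fun v1 v2 -> Int (to_int v1 - to_int v2))
  | Equal (a, b) -> binop a b (fun v1 v2 -> Bool (to_int v1 = to_int v2))
  | If (c, t, f) ->
    (match eval c with
     | Raise _ as r -> r
     | Bool true -> eval t
     | Bool false -> eval f
     | _ -> failwith "expected a boolean")
  | Appl (e1, e2) ->
    (match eval e1 with
     | Raise _ as r -> r
     | Function (x, body) ->
       (match eval e2 with
        | Raise _ as r -> r
        | v2 -> eval (subst body v2 x))
     | _ -> failwith "expected a function")
  | Raise (e1, e2) ->
    (match eval e1 with
     | Raise _ as r -> r                       (* Raise bubbling through Raise *)
     | Exn _ as xn ->
       (match eval e2 with
        | Raise _ as r -> r
        | v -> Raise (xn, v))                  (* the Raise rule *)
     | _ -> failwith "expected an exception name")
  | Try (body, Exn xn, x, handler) ->
    (match eval body with
     | Raise (Exn xn', v) when xn' = xn -> eval (subst handler v x)
     | v -> v)     (* ordinary values and non-matching Raise values pass through *)
  | Try _ -> failwith "Try expects a literal exception name"

(* Every operator bubbles a Raise from either argument. *)
and binop a b f =
  match eval a with
  | Raise _ as r -> r
  | v1 -> (match eval b with Raise _ as r -> r | v2 -> f v1 v2)

(* The example above, with the handler's exception name as a parameter. *)
let program handled =
  Appl (Function ("x",
          Try (Minus (If (Equal (Var "x", Int 0), Int 5,
                          Raise (Exn "return", Plus (Int 4, Var "x"))),
                      Int 8),
               Exn handled, "n", Var "n")),
        Int 4)
```

Here eval (program "return") gives Int 8, matching the hand evaluation. With a handler for a different name, as in eval (program "other"), the result is the value Raise (Exn "return", Int 8), which keeps bubbling.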

Exceptions as presented above are almost identical to the Caml form.

Efficiently Implementing Exceptions

(we didn't cover this topic in lecture)

The "bubbling up" method of interpretation is correct, but in practice it is very inefficient. For instance, - and every other operation will always have to check whether either argument is of the form Return ... or Raise ... and act appropriately, greatly slowing down code.

Better interpreters can be written that get around this problem, but for lack of time we address the problem in the context of compilers only.
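To give a rough idea of what such an interpreter might look like (this is not something covered in these notes), one option is to let the host language's own exceptions do the jumping. The sketch below assumes a cut-down, integers-only language with environments, and the names Dx_raise and eval are invented for the illustration. Note that Minus no longer checks its arguments for a raised exception.

```ocaml
type expr =
  | Int of int
  | Var of string
  | Minus of expr * expr
  | Raise of string * expr                 (* Raise xn(e) *)
  | Try of expr * string * string * expr   (* Try e With xn(x) -> e' *)

(* Hypothetical host-level exception carrying the DX exception name and value. *)
exception Dx_raise of string * int

let rec eval env e = match e with
  | Int n -> n
  | Var x -> List.assoc x env
  | Minus (a, b) ->
    let va = eval env a in
    let vb = eval env b in
    va - vb                                (* no per-operator Raise check *)
  | Raise (xn, a) -> raise (Dx_raise (xn, eval env a))
  | Try (body, xn, x, handler) ->
    (try eval env body with
     | Dx_raise (xn', v) when xn' = xn -> eval ((x, v) :: env) handler)
     (* a non-matching Dx_raise propagates to the next enclosing Try *)
```

For example, eval [] (Try (Minus (Raise ("return", Int 8), Int 8), "return", "n", Var "n")) is 8. An uncaught DX exception surfaces as an uncaught Dx_raise in the host.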

We sketch how the DSR compiler could be extended to a DSRX compiler. Exceptions are still "bubbled up", but we can take much bigger steps, immediately popping back to the nearest enclosing Try.
The closure conversion, A-translation, and function hoisting algorithms can be extended to exception constructs, with new tuple forms

x = [ let tuples ] Try xn (y) -> [let tuples ]
x = Raise y (* y a variable since body of Raise was A-translated *)
x = Exception
resulting.

The toC function can then be extended as follows.
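Note that int global_exception_count and jmp_buf nearest_Try are global C variables.
The <setjmp.h> C library is used to return from a deeply nested function call via setjmp/longjmp.
This library is a primitive nonlocal control operation.

Semantics of setjmp/longjmp: executing setjmp(aTry) is something like the fork library function in C, in that it can return more than once. It returns 0 when it is first called, saving the current point of execution in aTry; if longjmp(aTry, val) is later executed, control jumps back and the same setjmp call returns again, this time with the nonzero value val.

The code uses a global variable nearest_Try which at the current point of execution contains the nearest enclosing Try statement.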
toCTuple(x = Let Exception xn In e) = xn = global_exception_count++;
  toCTuples(e)

toCTuple(Raise xn(y)) = longjmp(nearest_Try,"[xn,y]") /* immediately pop off function activations,
       returning to nearest enclosing setjmp, i.e. Try */

toCTuple(x = [ let tuples 1 ] Try xn (y) -> [ let tuples 2 ]) =
  { jmp_buf next_nearest_Try;
    Word val;
    /* save the enclosing Try; jmp_buf is an array type, so it is copied with memcpy (<string.h>) rather than assigned */
    memcpy(next_nearest_Try, nearest_Try, sizeof(jmp_buf));
    /* setjmp call below marks this point in code as a future possible return point */
    /* nearest_Try is set to be this location in the code by the setjmp call */
    /* val is 0 when the code is normally executed, but if it is returned to, it is an argument */
    if ((val = setjmp(nearest_Try)) == 0)
       { /* 0 return indicates jump point was just set */
        toCTuples([let tuples 1]);
        memcpy(nearest_Try, next_nearest_Try, sizeof(jmp_buf)); /* pop the stack of Trys */ }
    else
       { /* non-0 indicates a longjmp brought us back to this set point */
        memcpy(nearest_Try, next_nearest_Try, sizeof(jmp_buf)); /* pop the Try stack */
        "[xn_raised,y]" = val; /* the setjmp value above, if not zero, is the value passed to longjmp */
        if (xn == xn_raised) /* exception raised matches this Try */
         { toCTuples([let tuples 2]) }
        else /* exception is not handled here; try next handler further up the stack */
           longjmp(nearest_Try,val);
       }
  }
The setjmp/longjmp code is unnatural syntax, but it gives enough of a primitive nonlocal control effect to allow ML-style exceptions to be implemented.

This implementation only requires "bubbling" through all enclosing Try statements that do not match; other bubbling steps are skipped.

Question: What will the above compiler do if an uncaught exception is raised? How could a better solution to this be devised?