600.226 Data Structures (Spring 2004 - Prof. Eisner)

Homework 5 ... due 4pm Friday 5 March 2004

Summary

Goal of this homework: You'll implement some methods for building and traversing trees, and will learn to think carefully about handling input errors. You'll also have to show a little more independence in this assignment, while continuing to follow good design practices!

Collaboration policy for this homework: You must follow the CS Department Academic Integrity Code and this time may not collaborate with another student (in contrast to the previous homework). Of course, you may get help from the course staff.

How to hand in your work: instructions, upload page

What to hand in (a directory, all packaged up in a zip or tar file):

Acknowledgments: Many thanks to Josh Dunkelman and Prof. Jonathan Cohen for their help in preparing this assignment.

Introduction

How does your Java compiler (javac) make sense of expressions like ((3+4)*((5^6)/7))? In this assignment, you will write a constructor that turns an expression from a string into a tree. You will also write methods that operate on the tree. These methods will find the value of the expression (15625) and turn the expression back into a tree in preorder, inorder, and postorder forms.

You are provided with a ViewTree class to test your methods. The command java ViewTree '((3+4)*((5^6)/7))' will bring up the following window. The drawing is produced by our ViewTree code (using a technique explained in the textbook). But the values in the four lines at the bottom are produced by calling your code.

It is possible that you will extend this code in a later assignment, for example, to allow expressions with variables in them.

Parenthesized Expressions as Tree Input

The expressions input to your program should appear as a form of infix notation. However, you only have to deal with fully parenthesized expressions. Thus, 5+2*4 is not a legal input; it must be written as (5+(2*4)). This saves you from having to deal with operator precedence rules (otherwise known as "order of operations").

The legal inputs can be defined recursively as follows:

Some examples of valid expressions:

When encountering an invalid expression, you should throw an InvalidExpressionException whose string argument is a clear error message. This string should tell the user what problem you encountered and where you encountered it. Your error messages should be comparable in their level of detail to the following, though yours might be different (depending on how your solution realizes that there's an error):

Implementing Binary Trees

You should start by writing a class (you can choose the name) that implements the following BinaryTree interface:

public interface BinaryTree {
    public Position root();
    public Position parent(Position v);     // should return null if no parent (root)
    public Position leftChild(Position v);  // should return null if no left child
    public Position rightChild(Position v); // should return null if no right child
    public boolean isRoot(Position v);
    public boolean isExternal(Position v);
    public boolean isInternal(Position v);
    public void createExternal(Position v); // v is an external node; give it left
					    // and right children with null elements,
                                            // thereby making it internal
    public void replaceElement(Position v, Object e);  
}

You may implement the above BinaryTree interface in any way you choose (you are welcome to use and modify any code from the course textbook, but not from other sources). The nodes should implement the Position interface in the book (supporting the element() method). Thus you may choose to use either nodes with parent/child reference links or else a vector-based (heap-style) implementation as described in the textbook.

Here is all the code in the textbook so that you don't have to type it in yourself. However, you will have to figure out which code to use. Also, our interface differs slightly from the textbook's -- the parent, leftChild, and rightChild methods return a null Position instead of throwing a BoundaryViolationException -- and you will have to follow our interface since that's what ViewTree expects.

Implementing Expressions

Now you should write an TreeExpression class that implements the following Expression interface:

/** Represents a arithmetic expression.  */

public interface Expression {
  /** Converts to a string in prefix notation. 
   * This can be accomplished by a preorder traversal of the tree. */
  public String prefix();

  /** Converts to a string in infix notation. */
  public String infix();

  /** Converts to a string in postfix notation. */
  public String postfix();

  /** Finds the value of the expression. */
  public double value();
}

Your TreeExpression class should implement not only Expression but also BinaryTree, presumably by extending your binary tree class. That is so that when the ViewTree class wants to display an expression's tree, it will be able to call methods of BinaryTree on the expression.

Remember that TreeExpression extends your binary tree class. That means that every TreeExpression actually is a binary tree (augmented with some extra methods like infix(), so that it also implements Expression). It doesn't have to contain another binary tree.

Your TreeExpression class should have a constructor that takes a string argument such as "( (5+ 8)*( 3 +5 ))". ViewTree will call this constructor. If the string is invalid, the constructor should throw an InvalidExpressionException that contains an appropriately detailed error message as described above. This is the only exception that ViewTree knows how to catch, so you should be careful not to throw any others:

/** Thrown in response to a string that is not a legal 
 * arithmetic expression.  The String argument may be shown
 * to the user, so it should clearly explain what and where 
 * the trouble is.
 */

public class InvalidExpressionException extends Exception {  
  public InvalidExpressionException(String err) {
    super(err);  // just call the general Exception constructor
  }
}

Note that this is an ordinary checked exception, meaning that any method that throws the exception must declare throws InvalidExpressionException in its signature. This reminds a programmer using the method that it might throw this exception, and it forces her to either handle it (with catch) or explicitly declare that she would throw it further up. Although we have been using some unchecked exceptions in recent assignments (this is done by extending RuntimeException instead of Exception), they are rare - you should only use them for exceptions that most callers wouldn't want to handle. Here's some discussion of this point.)

You'll probably want to test your constructor right away using the ViewTree class. So write temporary stubs for the other methods. (These stubs should just return bogus values like "unknown" and 0.) Then you should at least be able to see the tree.

Once your constructor is working properly, you should fill in the other methods of the interface. There is some discussion in the textbook about how to do this. Now test those methods using ViewTree, too.

Of course, you are not a helpless slave of the ViewTree class: we just gave it to you to be nice. If things don't work, or if you want a more thorough test than ViewTree provides, then write your own test code. For example, once you've written TreeExpression.prefix(), you can and should call it from your own test method TreeExpression.main().

Note: If you get a NullPointerException from TreeArea, it is probably thrown by p.element().toString(), in which case your problem is either that p==null or p.element()==null. Here p is a Position that ViewTree has found by exploring your TreeExpression, starting with p=te.root() and recursing to both te.leftChild(p) and te.rightChild(p) as long as te.isInternal(p). Obviously, ViewTree must explore the tree like this in order to find and draw all the positions (nodes). But it should not be able to find a null position in this way, or a position with a null element. If it does, there is surely something wrong with your code, resulting in that NullPointerException. The best way to figure out what is going wrong is to write your own test code that also explores the tree, and see where your code gets into trouble.

Possible Strategies for Expression Parsing

The positions of your tree store Objects as their elements, of course. But what kind of objects? You should probably store a Double at each external node, and a Character such as + at each input node.

(You should remember from homework 3 how to convert between char and Character. Converting between double and Double is similar. These classes are in the java.lang package; refer as necessary to their documentation.)

How do you convert a fully parenthesized expression string into a tree? If the string is legal, one of these two cases must hold. (a) The string is a number, in which case you can just return a one-node tree. (b) The string starts with a left parenthesis, in which case it must have the form (s1 c s2) where c is an operator character. There are at least three possible strategies:

  1. In case (b) above, you can scan the string to find the operator character, going left to right and counting +1 for left parentheses and -1 for right parentheses. Now use the substring() method to extract the two subexpressions that are arguments to that operator. Recursively build the left and right child subtrees out of these subexpressions.

    This is not as inefficient as it may seem, since creating a substring takes O(1) time. It doesn't require any copying of characters; internally, a substring is represented as a pointer to the original string together with a pair of integers that indicate the extent of the substring.

    But it is nonetheless a somewhat time-inefficient strategy: each character will be repeatedly scanned (though not copied), once for every expression it's part of.

    If you use this strategy, be especially careful to handle whitespace. Also note that your error messages are likely to look a bit different from the examples given earlier.

    This strategy also has to be careful about minus and plus symbols. The - symbols in -3.45 and 3.45e-67 are not subtraction operators, and the + symbol in "6.324e+05" is not an addition operator. Probably the easiest solution is to look at the character before the minus or plus symbol (ignoring whitespace); that should let you figure out what kind of minus or plus it is.

  2. You can write a protected method eatExpression, which removes the smallest complete expression from the front of the string and returns a tree for it. eatExpression can be implemented recursively. If the string is just a number, you return a 1-node tree. If it has the form (s1 c s2), then you do 5 steps: eat a left parenthesis, recursively eat a subexpression s1, eat an operator character c, recursively eat a second subexpression s2, and finally eat the right parenthesis. You can combine the operator with the two subexpression trees to get a tree to return.

    If any of these steps go wrong, it signals that something is wrong with the input string, so you should throw an exception with a detailed error message. For example, you might be ready to eat an operator character, but there isn't one at the front of the string for you.

  3. The previous two techniques use recursion. Recursion is a very nice way of keeping track of where you are and how to assemble the results. After all, when a method call returns, its caller remembers where to put the result in the tree and how to continue with its own computation.

    But you can also solve the problem without recursion. The idea is to eat the string from left to right, building the tree as you go. Your "current node" in the tree keeps track of where you are in the computation. If you eat a number, it goes at the current node. If you eat a left parenthesis, you make the current external node internal (by giving it children with createExternal()), and descend to the new left child, which becomes the current node. If you eat an operator, then ... well, you get the idea.

    Again, if any of these steps go wrong, it signals a problem and you should throw an exception with a detailed error message. For example, the input string might inappropriately "tell" you to add children to a node that is already internal, or add an element to a node that already has one, or ... what else?

Regardless of which strategy you choose, your error messages should be designed to be useful to the user. Don't just report something like can't add children to an internal node. Tell the user what is wrong with the string so that he/she knows how to fix it!

Resources You Can Use

Download the six ViewTree .class files to your current directory or CLASSPATH directory. (Or download the single zipfile containing them.)

As already mentioned, you can use code from the textbook.

Use String.charAt() to extract a single char from a string, and String.substring() to get an arbitrary-length substring.

The String.trim() method is very helpful for eliminating whitespace at the beginning or end of a string. Or just use the lower-level method Character.isWhitespace() as a portable way to find out whether a single character is a whitespace character (typically space, tab, or newline).

You might find the StringBuffer class useful for efficiently building up a long string, for example for the prefix, infix, and postfix methods. (You can get away without it, though.)

You can convert a string to a double using Double.parseDouble(). But two of the strategies in the previous section require you to "eat" characters and doubles from the front of a string. This means you must remember your current position in the string (i.e., how much have you eaten so far?). Here are three options:

Extra Credit

Fully parenthesized expressions make things easier for you, but harder for the user. The point of this extra credit challenge is to deal with strings that are not necessarily fully parenthesized. As before, you want to convert back and forth between strings and expression trees.

Rename your old TreeExpression class (and its file) to FullParenTreeExpression. Now write a new TreeExpression class that extends FullParenTreeExpression. The new class should behave differently in two ways:

You can test your improved TreeExpression class with ViewTree as before. If you only handle one of the two bullet points above, you will still get extra credit; more if you handle both.


Jason Eisner - jason@cs.jhu.edu - $Date: 2004/03/08 07:18:31 $