600.226: Data Structures

Fall Semester 2005: September 8, 2005 - December 12, 2005

Assignment 5: Trees of Wonder

Out on: October 6, 2005
Due by: October 12, 2005 by 5:59 pm for full credit (11:59 pm for 10% off, hard deadline)
Collaboration: None
Grading: Packaging 10%, Style 10%, Performance 10%, Design 20%, Functionality 50%

Overview

The fifth assignment for 600.226: Data Structures deals mostly with trees and their applications. Actually that's not quite right: It deals with a "tree-ish" thing, you'll see... There are no written problems, but of course you're free to rant about something in your README file anyway.

Fun with Directories

As mentioned in lecture, the typical UNIX file system has the shape of a tree, with directories as internal nodes and files as external nodes. It's pretty simple to get access to the directory structure in Java; the following example shows how to get a listing of the current directory (denoted by "." in UNIX):

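import java.io.File;

public class Dir
{
  public static final void main( String[] args )
  {
    File f = new File( "." );
    // list() returns null if f is not a directory or cannot be read
    String[] s = f.list();
    if (s == null) {
      System.err.println( "cannot list " + f );
      return;
    }
    for (int i = 0; i < s.length; i++) {
      System.out.print( s[i] );
      // resolve the name against f, so this still works for paths other than "."
      File h = new File( f, s[i] );
      if (h.isDirectory()) {
        System.out.println( "/" );
      }
      else {
        System.out.println( "" );
      }
    }
  }
}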

Take a look at the API documentation of the java.io.File class to get an idea of the other methods you can use to deal with the UNIX file system. I recommend that you explore those methods by starting with the example above and adding various features to it until you have a good feeling for the API.
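To give you an idea of what such an exploration might look like, here is a small sketch that asks java.io.File a few more questions about each entry. The class name FileInfo and the method describe are made up for this illustration, and the output format is arbitrary; only the File methods themselves (getName, isDirectory, isFile, length, isHidden, listFiles) come from the standard API.

```java
import java.io.File;

class FileInfo {
  // Builds a one-line description of a path using a few java.io.File queries.
  static String describe(File f) {
    StringBuilder sb = new StringBuilder(f.getName());
    if (f.isDirectory()) {
      sb.append("/ (directory)");
    } else if (f.isFile()) {
      sb.append(" (file, ").append(f.length()).append(" bytes)");
    } else {
      // neither a directory nor a regular file, e.g. the path does not exist
      sb.append(" (missing or special)");
    }
    if (f.isHidden()) {
      sb.append(" [hidden]");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    File dir = new File(args.length > 0 ? args[0] : ".");
    File[] entries = dir.listFiles(); // null if dir is not a readable directory
    if (entries == null) {
      System.err.println("cannot list " + dir);
      return;
    }
    for (File e : entries) {
      System.out.println(describe(e));
    }
  }
}
```

Note that listFiles() hands back File objects that already carry the parent path, which saves you from gluing names onto directories by hand.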

Your task is to write a new version of Dir that produces a recursive directory listing. You start at the path the user supplies as the first command line argument; if no path is given you start in the current directory. For example, if your program is invoked as java Dir /usr/bin then you should start your listing at the directory /usr/bin. What to do? Well, you start by getting a list of entries in that first directory; now you need to go through this list and find further directories, for which you need to request lists of their entries in turn; and so on, and so forth.
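The "and so on, and so forth" part is just recursion. The following is a minimal sketch of that idea, not the required design: the names Walk, startDir, and collect are invented for this example, it merely gathers path strings in a list instead of producing the output format asked for below, and sorting the entries is my own choice (listFiles() promises no particular order). It also assumes there are no symbolic links that form cycles.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Walk {
  // Picks the start directory: the first argument if present, otherwise ".".
  static File startDir(String[] args) {
    return new File(args.length > 0 ? args[0] : ".");
  }

  // Appends every entry below dir to out; directories get a trailing "/".
  static void collect(File dir, String prefix, List<String> out) {
    File[] entries = dir.listFiles();
    if (entries == null) {
      return; // not a directory, or not readable
    }
    Arrays.sort(entries); // File is Comparable; order is by pathname
    for (File e : entries) {
      if (e.isDirectory()) {
        out.add(prefix + e.getName() + "/");
        // descend: the same steps again, one level further down
        collect(e, prefix + e.getName() + "/", out);
      } else {
        out.add(prefix + e.getName());
      }
    }
  }

  public static void main(String[] args) {
    List<String> found = new ArrayList<>();
    collect(startDir(args), "", found);
    for (String s : found) {
      System.out.println(s);
    }
  }
}
```

Running java Walk /usr/bin would start the walk at /usr/bin, while plain java Walk starts in the current directory.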

You should "record" the directory layout you find in some kind of "tree-shaped" data structure, and you are free to design that data structure in whatever way you want. Once you are "done" and have a complete representation of the directory structure, it's time to generate your output. You should produce simple, textual output that shows the directory structure ("nesting") by indenting appropriately, like this:

some_file
some_other_file
a_first_directory/
  some_other_file
a_second_directory/
  a_third_directory/
    yet_another_file
some_last_file

Note that the contents of the "top" directory are not indented, but the contents of further directories are. For each "level" you should use two spaces of indentation.
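If you do keep the layout in a data structure, the indentation rule falls out of a depth counter. Here is one possible sketch, under the assumption of a hypothetical DirNode class that stores a name, a directory flag, and a list of children; none of these names are prescribed by the assignment, and the tree in main is built by hand rather than read from disk.

```java
import java.util.ArrayList;
import java.util.List;

class DirNode {
  final String name;
  final boolean directory;
  final List<DirNode> children = new ArrayList<>();

  DirNode(String name, boolean directory) {
    this.name = name;
    this.directory = directory;
  }

  // Adds a child and returns this node, so calls can be chained.
  DirNode add(DirNode child) {
    children.add(child);
    return this;
  }

  // Renders the children of this node, two spaces per nesting level.
  // The node itself is not printed, so the top directory's contents
  // come out unindented when called with depth 0.
  void render(int depth, StringBuilder out) {
    for (DirNode c : children) {
      for (int i = 0; i < depth; i++) {
        out.append("  ");
      }
      out.append(c.name);
      if (c.directory) {
        out.append('/');
      }
      out.append('\n');
      c.render(depth + 1, out);
    }
  }

  public static void main(String[] args) {
    DirNode root = new DirNode(".", true)
        .add(new DirNode("some_file", false))
        .add(new DirNode("a_first_directory", true)
            .add(new DirNode("some_other_file", false)))
        .add(new DirNode("some_last_file", false));
    StringBuilder sb = new StringBuilder();
    root.render(0, sb);
    System.out.print(sb);
  }
}
```

Passing the depth down as a parameter keeps the nodes themselves free of any formatting knowledge; whether that is the design you want is up to you.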

Hint

There is one slightly misleading statement in the above description. If you're keen on minimizing your work, read the instructions two or three times before starting out.

Deliverables

Please turn in a gzip compressed tarball of your assignment (the extension should be .tar.gz). The tarball should uncompress into a directory cs226-assignment-5-login with login replaced by your Unix login name (so I would use cs226-assignment-5-phf); uncompressing should not create any other files in the current directory. The tarball should contain no derived files whatsoever (i.e. no .class files, no .html files, etc.), but allow building all derived files. Include a README file that briefly explains what your programs do and contains any other notes you want us to check out before grading (and of course your answers to "written" problems).

Grading

For reference, here is a short explanation of the grading criteria. Packaging refers to the proper organization of the stuff you hand in, following the guidelines for Deliverables above. Style refers to Java programming style, including things like consistent indentation, appropriate identifiers, useful comments, suitable javadoc documentation, etc. Simple, clean, readable code is what you should be aiming for. Performance refers to how fast your program can produce the required results compared to other submissions. Design refers to proper modularization and the proper choice of algorithms and data structures. Functionality refers to your programs being able to do what they should according to the specification given above; if the specification is ambiguous and you had to make a certain choice, defend that choice in your README file.

If your programs cannot be built you will get no points whatsoever. If your programs cannot be built without warnings using javac -Xlint we will take off 10% (except if you document a very good reason). If your programs fail miserably even once, i.e. terminate with an exception of any kind, we will take off 10%.

Bonus Problem

The bonus problem asks you to replace the simple, textual output above with graphical output of the directory structure. Actually, you still output text, but in a format that can be used by another program called dot to draw a nice tree for the directories. You can find plenty of information on dot by doing man dot on one of the undergraduate Linux machines, or you can read up on all the details in the Graphviz documentation (dot is part of the Graphviz package). Here is an example output in dot format:

strict graph Directory {
 rankdir=BT;

 0 [ label="./", shape=rectangle ];
 1 [ label="some_file" ];
 2 [ label="some_other_file" ];
 3 [ label="a_first_directory/", shape=rectangle ];
 4 [ label="some_other_file" ];
 5 [ label="a_second_directory/", shape=rectangle ];
 6 [ label="a_third_directory/", shape=rectangle ];
 7 [ label="yet_another_file" ];
 8 [ label="some_last_file" ];

 1 -- 0;
 2 -- 0;
 3 -- 0;
 4 -- 3;
 5 -- 0;
 6 -- 5;
 7 -- 6;
 8 -- 0;
}

There are two main sections: the first one defines the nodes (rectangles for directories and ellipses, the default shape, for files), the second one defines connections between nodes. The dot language is a little bizarre, but it's not too bad. Here's an example image, the kind that dot will generate for you:

[Sample Image]

As you may notice, this image does not look very nice, but that's a limitation of the GIF backend; images in PDF look pretty neat.

Note that you have to "make up" a node for the directory you started from (here "./" for the current directory), otherwise the picture wouldn't look very nice. Your dot code should go to standard output, so you can use java Dir | dot -Tps | ps2pdf - >out.pdf to produce a PDF file. A GIF file is even easier: you can make it with java Dir | dot -Tgif >out.gif instead. Enjoy!
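For the curious, here is a rough sketch of how the two sections of the dot file could be produced in a single pass: number the nodes in the order you visit them, and remember each node's parent number for the edge list. The class DotTree and its methods are hypothetical names for this illustration, and the tree is again built by hand. Only double quotes in labels are escaped here; other unusual characters in file names are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class DotTree {
  final String label;
  final boolean directory;
  final List<DotTree> children = new ArrayList<>();

  DotTree(String label, boolean directory) {
    this.label = label;
    this.directory = directory;
  }

  DotTree add(DotTree child) {
    children.add(child);
    return this;
  }

  // Writes this node's line to nodes and its edge to edges (preorder),
  // then recurses into the children. Returns the next unused id.
  private int emit(int id, int parent, StringBuilder nodes, StringBuilder edges) {
    String text = label.replace("\"", "\\\"") + (directory ? "/" : "");
    nodes.append(" ").append(id).append(" [ label=\"").append(text).append("\"");
    if (directory) {
      nodes.append(", shape=rectangle");
    }
    nodes.append(" ];\n");
    if (parent >= 0) {
      edges.append(" ").append(id).append(" -- ").append(parent).append(";\n");
    }
    int next = id + 1;
    for (DotTree c : children) {
      next = c.emit(next, id, nodes, edges);
    }
    return next;
  }

  // Assembles the complete graph: header, node section, edge section.
  String toDot() {
    StringBuilder nodes = new StringBuilder();
    StringBuilder edges = new StringBuilder();
    emit(0, -1, nodes, edges); // the made-up start node gets id 0, no parent
    return "strict graph Directory {\n rankdir=BT;\n\n" + nodes + "\n" + edges + "}\n";
  }

  public static void main(String[] args) {
    DotTree root = new DotTree(".", true)
        .add(new DotTree("some_file", false))
        .add(new DotTree("a_first_directory", true)
            .add(new DotTree("some_other_file", false)));
    System.out.print(root.toDot());
  }
}
```

Collecting nodes and edges in two separate buffers is what lets a single traversal produce both sections in the order the example output uses.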

Updated: $Id: assignment-5.html 141 2005-10-07 02:56:38Z phf $
Copyright © 2005 Peter H. Fröhlich. All rights reserved.