Spring Semester 2006: January 30, 2006 - May 5, 2006
Out on:
March 16, 2006
Due by:
March 30, 2006 by 5:59 pm for full credit (11:59 pm for 10% off, hard deadline)
Collaboration:
Pairs
Grading:
Packaging 10%, Style 10%, Performance 10%, Design 20%, Functionality 50%
The seventh assignment for
600.226: Data Structures
deals mostly with sets, orders, priority queues, and related concepts.
There are some "written" problems as well, to be answered in the
README file.
BTW, the assignment is titled "Setting Priorities" for a reason; I'll
clarify that on the discussion list if necessary...
Note that each pair hands in one assignment!
Decide early on who is going to be responsible for submitting the
assignment and when.
Make sure to include all the relevant information (who is in the
pair?) in your README file!
Both of you will get the same score for the assignment.
Here are the necessary interfaces and exception classes: sets.tar.gz
As usual, you are not allowed to change the code we provide in any way!
Warning: This is a new version of the assignment and there may be serious
bugs in these interfaces.
If you think you found a bug, please email the course staff about it
immediately. Thanks!
Your first task is to write a class
SimpleSet<T>
that implements the
Set<T>
interface we provided above.
You are free to use the Java classes
java.util.List<T>,
java.util.ArrayList<T>,
and
java.util.LinkedList<T>.
Of course you can also hack your implementation
"from scratch"
if you prefer, but that will make things somewhat
more tedious when it comes to iterators...
As usual, please provide a toString()
method to return a String representation
of the set, and a main() method that
performs basic unit testing for your implementation.
A new set into which the elements 1, 2, and 3 were
inserted should print as
{1, 3, 2}
or something close; the order of elements is not
defined.
Make sure that your unit tests cover the "set semantics"
we discussed in class, e.g. removing an element that is
not in the set doesn't change the set.
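To make the "set semantics" point concrete, here is a minimal sketch of a list-backed set together with the kind of checks a main() might perform. The class name ListSetSketch and the method names insert(), remove(), and size() are assumptions for illustration only; the real method names come from the Set<T> interface in sets.tar.gz, which this sketch deliberately does not implement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the provided Set<T> interface.
class ListSetSketch<T> {
    private final List<T> items = new ArrayList<T>();

    // Inserting an element that is already present leaves the set unchanged.
    void insert(T t) {
        if (!items.contains(t)) {
            items.add(t);
        }
    }

    // Removing an element that is absent leaves the set unchanged.
    void remove(T t) {
        items.remove(t);
    }

    boolean has(T t) {
        return items.contains(t);
    }

    int size() {
        return items.size();
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(items.get(i));
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        ListSetSketch<Integer> s = new ListSetSketch<Integer>();
        s.insert(1);
        s.insert(2);
        s.insert(3);
        s.insert(2);   // duplicate: no effect
        s.remove(42);  // absent: no effect
        System.out.println(s + " size=" + s.size()); // prints {1, 2, 3} size=3
    }
}
```

Every operation here is a linear scan, which is exactly the sort of thing the README discussion should weigh against other options.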
Describe the data structure you used for your implementation
in your README file and discuss the asymptotic
complexity for each operation of the Set<T>
interface. Also, explain why the data structure you picked is
a good one compared to other implementation options.
Your second task is to write a class
SimpleOrderedSet<T>
that implements the
OrderedSet<T>
interface we provided above.
Once again you are free to use the Java classes
java.util.List<T>,
java.util.ArrayList<T>,
and
java.util.LinkedList<T>.
The only other constraint is that your implementation
must
support the has()
operation in
O(log n)
worst-case time!
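One way to meet the O(log n) bound is binary search over a sorted java.util.ArrayList<T>. The following is only a sketch under assumptions: the class name SortedListSketch, the helper find(), and the bound T extends Comparable<T> are made up here, and the provided OrderedSet<T> interface may well look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the provided OrderedSet<T> interface.
class SortedListSketch<T extends Comparable<T>> {
    // Kept sorted at all times; ArrayList gives O(1) access by index.
    private final List<T> items = new ArrayList<T>();

    // Binary search: returns the index of t if present,
    // otherwise -(insertionPoint) - 1 (a negative number).
    private int find(T t) {
        int lo = 0;
        int hi = items.size() - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int c = items.get(mid).compareTo(t);
            if (c == 0) {
                return mid;
            } else if (c < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -(lo + 1);
    }

    // O(log n) worst case: at most one comparison per halving step.
    boolean has(T t) {
        return find(t) >= 0;
    }

    // O(n) worst case: the search is fast, but add() shifts elements.
    void insert(T t) {
        int i = find(t);
        if (i < 0) {
            items.add(-(i + 1), t);
        }
    }

    @Override
    public String toString() {
        // ArrayList prints as [1, 2, 3]; swap the brackets for braces.
        String s = items.toString();
        return "{" + s.substring(1, s.length() - 1) + "}";
    }
}
```

Note that the same search on a java.util.LinkedList<T> would not be O(log n), since get(i) has to walk the list.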
As usual, provide a toString() method to
return a String representation
of the set, and a main() method that
performs basic unit testing for your implementation.
The set should print its elements in sorted
order, so {1, 2, 3} for the set from Problem 1
above.
Discuss the relationship between
Set<T>
and
OrderedSet<T>
in your
README
file.
Is it a good idea that these interfaces are related?
How "related" are your implementations,
SimpleSet<T>
and
SimpleOrderedSet<T>?
Did you have to duplicate code that you wish could
be put in one place instead?
Can you suggest a better way to organize these
interfaces and implementations?
No doubt you remember the Numbers.java program
from the very first assignment?
You're about to do that one again, but (probably) using a
somewhat more sophisticated approach.
In fact, we've given you Numbers.java already;
you'll concentrate on implementing suitable data structures
instead. :-)
Your task is to write two implementations
of the
MultiSet<T>
interface we provided above.
The first, called
SimpleMultiSet<T>,
is just a variation on the two set implementations
you already hacked for Problems 1 and 2.
The second, called
FancyMultiSet<T>,
should be self-organizing, but it's up to you what heuristic
to use for this (transpose, move-to-front, something else).
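As an illustration of one of the heuristics named above, here is a rough sketch of move-to-front on a linked list of (value, count) entries. Everything about its shape is an assumption: the class name MoveToFrontSketch, the Entry helper, and the methods insert() and count() are invented for this example and need not match the provided MultiSet<T> interface.

```java
import java.util.Iterator;
import java.util.LinkedList;

// Hypothetical sketch, NOT the provided MultiSet<T> interface.
class MoveToFrontSketch<T> {
    // One entry per distinct value, with the number of occurrences.
    private static final class Entry<T> {
        final T value;
        int count;

        Entry(T value) {
            this.value = value;
            this.count = 1;
        }
    }

    private final LinkedList<Entry<T>> entries = new LinkedList<Entry<T>>();

    // Linear search; on a hit the entry is unlinked and moved to the front,
    // so recently used values are found faster next time.
    private Entry<T> find(T t) {
        Iterator<Entry<T>> it = entries.iterator();
        while (it.hasNext()) {
            Entry<T> e = it.next();
            if (e.value.equals(t)) {
                it.remove();
                entries.addFirst(e);
                return e; // the iterator is not used again after the change
            }
        }
        return null;
    }

    void insert(T t) {
        Entry<T> e = find(t);
        if (e != null) {
            e.count++;
        } else {
            entries.addFirst(new Entry<T>(t));
        }
    }

    // Note: a lookup also reorganizes the list.
    int count(T t) {
        Entry<T> e = find(t);
        return e == null ? 0 : e.count;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("{");
        for (Entry<T> e : entries) {
            if (sb.length() > 1) {
                sb.append(", ");
            }
            sb.append(e.value).append(":").append(e.count);
        }
        return sb.append("}").toString();
    }
}
```

Inserting 1, 2, 3, 1 in that order leaves the list as {1:2, 3:1, 2:1}. Transpose would instead swap a found entry with its predecessor only.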
As usual, provide a toString() method to
return a String representation
of each multiset, and a main() method that
performs basic unit testing.
The "real" test, however, is using your classes as part of
the Numbers.java program.
Obviously you'll have to change the program in minor ways
to try out your two implementations: Feel free to do so,
just make sure you're not breaking anything... :-)
You should design and perform a number of experiments to
evaluate which
MultiSet<T>
implementation performs better for what kinds of input
in the Numbers.java application.
Describe the experiments and present your results in your
README file.
Finally, take a look back at all the interfaces for sets
we had so far and discuss how they could be "refactored"
into a coherent hierarchy. If that's not possible, discuss
what problems prevent us from forming such a hierarchy.
Your fourth and final task for this assignment is to
write two implementations of the
PriorityQueue<T>
interface we provided above.
The first, called
SimplePriorityQueue<T>,
should be quite similar to
SimpleOrderedSet<T>
from Problem 2 above.
The second, called
HeapPriorityQueue<T>,
should be based on the heap data structure
discussed in class, but it's up to you whether
to implement the heap in terms of an array-based
binary tree or a linked binary tree.
However, the heap-based implementation must
support
top() in O(1),
insert() in O(log n),
and
remove() in O(log n)
worst-case time.
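For orientation, here is a minimal sketch of an array-based binary min-heap with these three operations. Several things are assumptions rather than facts about the provided code: the class name HeapSketch, the choice that the smallest element has top priority, the bound T extends Comparable<T>, the empty() method, and the use of java.util.NoSuchElementException in place of whatever exception classes sets.tar.gz defines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch, NOT the provided PriorityQueue<T> interface.
class HeapSketch<T extends Comparable<T>> {
    // Children of index i live at 2i+1 and 2i+2; parent at (i-1)/2.
    private final List<T> heap = new ArrayList<T>();

    boolean empty() {
        return heap.isEmpty();
    }

    // O(1): the smallest element is always at index 0.
    T top() {
        if (heap.isEmpty()) {
            throw new NoSuchElementException("empty queue");
        }
        return heap.get(0);
    }

    // Append at the end, then bubble up along one root path.
    void insert(T t) {
        heap.add(t);
        int i = heap.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (heap.get(i).compareTo(heap.get(parent)) >= 0) {
                break;
            }
            swap(i, parent);
            i = parent;
        }
    }

    // Move the last element into the root, then bubble down.
    void remove() {
        if (heap.isEmpty()) {
            throw new NoSuchElementException("empty queue");
        }
        T last = heap.remove(heap.size() - 1);
        if (heap.isEmpty()) {
            return;
        }
        heap.set(0, last);
        int i = 0;
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int smallest = i;
            if (left < heap.size()
                    && heap.get(left).compareTo(heap.get(smallest)) < 0) {
                smallest = left;
            }
            if (right < heap.size()
                    && heap.get(right).compareTo(heap.get(smallest)) < 0) {
                smallest = right;
            }
            if (smallest == i) {
                break;
            }
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int i, int j) {
        T tmp = heap.get(i);
        heap.set(i, heap.get(j));
        heap.set(j, tmp);
    }
}
```

One caveat with this design: ArrayList.add() is only amortized constant time because of occasional resizing, so whether insert() here counts as O(log n) in the worst case is something you would have to argue in your README.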
As always, provide a toString() method to
return a String representation
of each queue, and a main() method that
performs basic unit testing.
The "real" test, however, is using your classes as part of
the Sort.java program we provide.
Obviously you'll have to change the program in minor ways
to try out your two implementations: Feel free to do so,
just make sure you're not breaking anything... :-)
You should design and perform a number of experiments to
evaluate which
PriorityQueue<T>
implementation performs better for what kinds of input
in the Sort.java application.
Describe the experiments and present your results in your
README file.
Finally, discuss how
PriorityQueue<T>
could (or should) be related to the "basic"
Queue<T>
interfaces from Assignment 2.
Any problems?
A simple way to measure how long a program runs is the time(1)
tool. For example, to get timing information for Sort.java
you can do the following:
phf@peregrine(sets)> time java Sort <input >output
0.47u 0.09s 0:00.60 93.3%
Check the man page for time(1) to find out what those numbers
mean and what options you can pass to the tool. You can also use the Java
profiler as described in an earlier assignment.
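Since time(1) measures the whole process, JVM startup included, it can also help to time just the part you care about from inside Java. The following is a sketch of that idea using System.nanoTime(); the class TimingSketch, its helpers, and the Collections.sort() workload are placeholders made up for this example, and you would substitute your own queue or multiset operations.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical timing helper; the workload below is only a placeholder.
class TimingSketch {
    // Runs the task once and returns the elapsed wall-clock time in ms.
    static double millis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1e6;
    }

    // Reproducible pseudo-random input: same seed, same numbers.
    static List<Integer> randomInput(int n, long seed) {
        Random rng = new Random(seed);
        List<Integer> data = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) {
            data.add(rng.nextInt());
        }
        return data;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 1000000; n *= 10) {
            final List<Integer> data = randomInput(n, 42L);
            double ms = millis(new Runnable() {
                public void run() {
                    Collections.sort(data); // replace with your own code
                }
            });
            System.out.println(n + " elements: " + ms + " ms");
        }
    }
}
```

Single runs are noisy (JIT warm-up, garbage collection), so repeating each measurement a few times and trying sorted, reversed, and random inputs gives a fairer picture.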
Please turn in a
gzip
compressed
tarball
of your assignment;
the filename should be
cs226-assign-7-login1-login2.tar.gz
with login1 and login2
replaced by your Unix login names on ugradx.cs.jhu.edu.
The tarball should contain no derived files whatsoever
(e.g. no .class files, no .html files, etc.),
but it must allow all derived files to be built.
Include a README file that briefly explains what your
programs do and contains any other notes you want us to check out
before grading; don't forget to include your answers to "written"
problems as well.
For reference, here is a short explanation of the grading criteria.
Packaging refers to the proper organization of the
stuff you hand in, following the guidelines for Deliverables above.
Style refers to Java programming style, including
things like consistent indentation, appropriate identifiers,
useful comments, suitable javadoc documentation, etc.
Simple, clean, readable code is what you should be aiming for.
Performance refers to how fast your program can
produce the required results compared to other submissions.
Design refers to proper modularization and the
proper choice of algorithms and data structures.
Functionality refers to your programs being
able to do what they should according to the specification
given above; if the specification is ambiguous and you had
to make a certain choice, defend that choice in your
README file.
If your programs cannot be built you will get no points whatsoever.
If your programs cannot be built without warnings using
javac -Xlint
we will take off 10% (except if you document a very good reason).
If your programs fail miserably even once,
i.e. terminate with an exception of any kind,
we will take off 10%.
Books on data structures often come with lots of pictures that
illustrate how a certain data structure is "maintained" as certain
operations are performed.
Add code to your implementation of
HeapPriorityQueue<T>
that produces DOT output illustrating how the heap data structure
evolves as operations are performed on it; can you illustrate
the "bubble-up" and "bubble-down" processes as well as just
the final result of an insert() call?
As always, we won't give you extra points for this,
but we'll give you extra kudos. :-)
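If you want a starting point, here is a small sketch that renders an array-based heap as a DOT graph. It assumes the array layout where the children of index i sit at 2i+1 and 2i+2; the class HeapDotSketch and the method toDot() are invented for this example, and a linked heap would need a tree traversal instead.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for an array-based heap; not part of the provided code.
class HeapDotSketch {
    // Emits one DOT node per slot and one edge from each parent to its child.
    static <T> String toDot(List<T> heap) {
        StringBuilder sb = new StringBuilder("digraph heap {\n");
        for (int i = 0; i < heap.size(); i++) {
            sb.append("  n").append(i)
              .append(" [label=\"").append(heap.get(i)).append("\"];\n");
        }
        for (int i = 1; i < heap.size(); i++) {
            sb.append("  n").append((i - 1) / 2)
              .append(" -> n").append(i).append(";\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        // The heap 1, 3, 8, 5 in array order: 1 is the root, 5 is a child of 3.
        System.out.print(toDot(Arrays.asList(1, 3, 8, 5)));
    }
}
```

Calling something like this after every swap, not just at the end of insert(), would give one picture per bubble-up or bubble-down step. Labels containing double quotes would need escaping, which this sketch does not do.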