Homework 5: Benchmarking Sets and Heaps

Homework is important practice for you, but it's not graded, so you don't have to submit it. This particular homework has you implement, test, and benchmark a variety of set data structures as well as a priority queue.

Experimental Analysis (Performance, Benchmarking)

You’ll perform a lot of experimental analysis for this homework: You’ll run some code and you’ll measure how it performs. You’ll work with jaybee as well as with new incarnations of the old Unique program and the xtime script. Think of the former as “unit benchmarking” the individual operations of a data structure and of the latter as “system benchmarking” a complete (albeit very small) application.

It is very important that you run your benchmarks under “identical” conditions to make them comparable! All your benchmarks should be run on the same (virtual) machine, using the same Java version, and with as little load (unrelated programs running, other users logged in, etc.) as possible. If the load on your machine is too high, the variance of your measurements will increase, making your results less reliable and hence less useful.

It is equally important that you run your benchmarks multiple times to rule out embarrassing outliers! Even if you keep the load on the machine low, there will still be some variation from one execution of a benchmark to the next. At a minimum, you should repeat your measurements often enough to get three to five runs that show about the same performance. Ideally you then average those measurements and indicate their variance as well. For example, if you measured 0.29, 0.33, and 0.30 seconds, you should average that to 0.31 seconds and report it as 0.31 (± 0.02).
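If you’d rather not do that arithmetic by hand, a throwaway helper along the following lines will do; this is only a sketch (the class name is made up, and it uses the maximum deviation from the mean as the “±” value):

    // Average a handful of timing measurements and report their spread
    // as "mean (± maximum deviation from the mean)".
    public final class ReportTimings {
        public static void main(String[] args) {
            double[] seconds = {0.29, 0.33, 0.30}; // your measured runtimes

            double sum = 0.0;
            for (double s : seconds) {
                sum += s;
            }
            double mean = sum / seconds.length;

            double spread = 0.0;
            for (double s : seconds) {
                spread = Math.max(spread, Math.abs(s - mean));
            }

            // For the numbers above this prints "0.31 (± 0.02)".
            System.out.printf("%.2f (± %.2f)%n", mean, spread);
        }
    }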

Profiling (Optional)

If you’re curious about where the Unique programs spend most of their time, you can use the Java profiler to figure out where the “hot spots” in those programs are for a given input file. You can find examples of how to do this in Peter’s old lecture notes PDF, but since we’re not “pushing” the profiler this semester, this is strictly optional.

Use the timing profiler whenever possible because it will give you a more accurate picture of what’s going on in the code; if that would take too long, use the sampling profiler to at least get a rough idea of what’s going on. You may want to create additional test data (small enough to use the timing profiler) as well. What you should be looking for is the cumulative percentage of time spent in the set operations.

Unit Testing (Correctness)

While it’s not mentioned explicitly below, you’ll also want to put together the usual JUnit 4 test drivers for the various interfaces and classes. Despite the homework focusing on performance issues, correctness is still the most important quality. Benchmarks of incorrect implementations are useless! The only straightforward way to demonstrate (both to yourself and to anyone else) that your code works is to have extensive unit tests for it. Remember to structure your tests as base classes for interface testing and subclasses for instantiation and implementation testing! Be sure to test all methods and be sure to test all exceptions for error situations as well.
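For reference, that structure might look roughly like this in JUnit 4; the Set<T> here is the course interface (not java.util.Set), and the method names insert and contains as well as the createSet factory are placeholders you’ll need to adjust to the code you actually have:

    import static org.junit.Assert.assertFalse;
    import static org.junit.Assert.assertTrue;

    import org.junit.Test;

    // Base class: tests written purely against the interface.
    public abstract class SetTestBase {
        // Each subclass decides which implementation gets tested.
        protected abstract Set<Integer> createSet();

        @Test
        public void insertedElementIsContained() {
            Set<Integer> set = createSet();
            set.insert(42);
            assertTrue(set.contains(42));
        }

        @Test
        public void emptySetContainsNothing() {
            assertFalse(createSet().contains(7));
        }
    }

    // Subclass (in its own file): nothing but the instantiation.
    class ArraySetTest extends SetTestBase {
        @Override
        protected Set<Integer> createSet() {
            return new ArraySet<Integer>();
        }
    }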

Problem 1: Warming Up

You won’t have to write data structure code for this first problem; instead, you’ll collect some important baseline data you’ll need for the following problems. We have provided two implementations of the Set<T> interface for you on Piazza: ArraySet<T> and ListSet<T>. These are the same implementations (at least for the most part) that we discussed in lecture, but you should still read the code to get a good handle on how they work. You will benchmark these two set implementations in two different ways:

You can generate the data sets for the second part using the makedata.py Python script we also provide on Piazza; read the long comment at the top of that program to figure out how to run it. Make sure that you follow these guidelines for your data sets:

If you wish, you can also vary the range of integers in each file to get a third dimension to evaluate, but if you don’t feel like doing that, just use a range of 1000000 for each.

Reflection: Put the benchmark data you collect for this problem in your README file and describe your observations. Include details about the data sets you used for your benchmarks. Also try to explain your observations using your understanding of the code you’re benchmarking. Discuss why the numbers come out the way they do!

Problem 2: Heuristics Ahead

Alright, now you get to write some code. You’ll modify the two implementations of Set<T> we provided to use the heuristics we discussed in lecture:
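One heuristic you may have seen is “move to front” for an unsorted linked list; the sketch below only illustrates the idea, and its class, field, and method names are made up rather than taken from the provided ListSet<T>:

    // Self-organizing list: every successful lookup moves the element
    // to the front, so frequently queried elements cluster near the head.
    public class MoveToFrontExample<T> {
        private static final class Node<E> {
            E data;
            Node<E> next;
            Node(E data, Node<E> next) { this.data = data; this.next = next; }
        }

        private Node<T> head;

        // Insert at the front; a real set would check for duplicates first.
        public void insert(T element) {
            head = new Node<T>(element, head);
        }

        // On a hit, unlink the node and splice it back in at the front.
        public boolean contains(T element) {
            Node<T> previous = null;
            for (Node<T> current = head; current != null; current = current.next) {
                if (current.data.equals(element)) {
                    if (previous != null) {
                        previous.next = current.next;
                        current.next = head;
                        head = current;
                    }
                    return true;
                }
                previous = current;
            }
            return false;
        }
    }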

Note that you should not subclass ListSet<T> or ArraySet<T>. Once you are reasonably sure that your new “adaptive” set implementations work (pass your test cases), collect all the information you collected in Problem 1 again for the new implementations:

Reflection: Put the benchmark data you collect for this problem in your README file and describe your observations, especially in comparison to your results from Problem 1. Also try to explain your observations using your understanding of the code you’re benchmarking. Discuss why the numbers come out the way they do!

Problem 3: Queuing Priorities

For the last problem we leave sets behind and look at the PriorityQueue<T> interface instead. As discussed in lecture, the semantics of priority queues and sets are very different indeed, so why bother having them on this homework? It turns out that a version of Unique based on a priority queue is a lot faster than one based on any of the set implementations we’ve seen so far, and we figured you should be exposed to that.

The big semantic difference between priority queues and sets is that if we insert “X” into a set three times and remove “X” once, it’s gone; in a priority queue, “X” would have to be removed three times before it’s gone. You’ll have to write a new version of Unique called UniqueQueue that takes this difference into account: You can still insert every number as it comes along; however, when you remove the numbers to print the output, you have to be careful not to print the same number repeatedly. That would defeat the whole purpose of Unique, after all!
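The core of that output loop might look roughly like this; the method names isEmpty, best, and remove and their exact signatures are placeholders, so check them against the actual PriorityQueue<T> interface:

    // Drain the queue in order, skipping repeated values. Equal numbers come
    // out of the queue back-to-back, so remembering the last value printed
    // is all the bookkeeping we need. Assumes best() returns the smallest
    // element without removing it and remove() deletes it.
    private static void printUnique(PriorityQueue<Integer> queue) {
        Integer previous = null;
        while (!queue.isEmpty()) {
            Integer current = queue.best();
            queue.remove();
            if (previous == null || !previous.equals(current)) {
                System.out.println(current);
                previous = current;
            }
        }
    }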

On Piazza, we provide a (bad!) implementation of the PriorityQueue<T> interface called SortedArrayPriorityQueue<T>. We do this mostly so you have an example of how to deal with the Comparator<T> objects, but also to give you something you can measure your own implementation against. You probably still need to read up on comparators and how they work!
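If you need a quick refresher: compare(a, b) returns a negative number if a should come before b, zero if the two count as equal, and a positive number otherwise. The reverse-order comparator below is only an illustration, not something the homework requires:

    import java.util.Comparator;

    // Orders integers from largest to smallest by flipping the arguments
    // of the natural-order comparison.
    public class ReverseIntegerComparator implements Comparator<Integer> {
        @Override
        public int compare(Integer a, Integer b) {
            return b.compareTo(a);
        }
    }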

You will implement BinaryHeapPriorityQueue<T> using the binary heap data structure described in lecture. It’s your choice whether you want to use a plain Java array or the ArrayList<T> class from the Java library for your implementation. You must write two constructors for your implementation — one that uses a default Comparator as described here, and one that has a Comparator parameter. If a client creates a BinaryHeapPriorityQueue<T> with no comparator, the best and remove methods should operate on the smallest element in the queue, not the largest element.
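To make the constructor requirement concrete, here is one possible skeleton; the backing store, the field names, and the sift-up code in insert are illustrative choices rather than requirements, and your real class must of course implement the PriorityQueue<T> interface and provide best and remove with a matching sift-down:

    import java.util.ArrayList;
    import java.util.Comparator;

    public class BinaryHeapPriorityQueue<T> {
        private final ArrayList<T> heap = new ArrayList<T>();
        private final Comparator<T> comparator;

        // Default constructor: fall back to the elements' natural ordering so
        // that best/remove operate on the smallest element. This assumes that
        // T implements Comparable; the unchecked cast is one common way to
        // express that without changing the class's type parameters.
        public BinaryHeapPriorityQueue() {
            this.comparator = new Comparator<T>() {
                @SuppressWarnings("unchecked")
                public int compare(T a, T b) {
                    return ((Comparable<T>) a).compareTo(b);
                }
            };
        }

        // Constructor for clients who supply their own ordering.
        public BinaryHeapPriorityQueue(Comparator<T> comparator) {
            this.comparator = comparator;
        }

        // Example of how the comparator gets used: sift a newly added element
        // up until its parent is at least as "good" according to comparator.
        public void insert(T element) {
            heap.add(element);
            int child = heap.size() - 1;
            while (child > 0) {
                int parent = (child - 1) / 2;
                if (comparator.compare(heap.get(parent), heap.get(child)) <= 0) {
                    break; // heap property restored
                }
                T tmp = heap.get(parent);
                heap.set(parent, heap.get(child));
                heap.set(child, tmp);
                child = parent;
            }
        }
    }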

After you are reasonably sure that your PriorityQueue<T> implementation works (passes your test cases), collect all the information you collected in Problem 1 again:

Just to be clear, you should collect performance data for both of the PriorityQueue<T> implementations, our SortedArrayPriorityQueue<T> and your BinaryHeapPriorityQueue<T>.

Reflection: Put the benchmark data you collect for this problem in your README file and describe your observations, especially in comparison to your results from Problems 1 and 2. Also try to explain your observations using your understanding of the code you’re benchmarking. Discuss why the numbers come out the way they do!