600.211: Unix Systems Programming

Fall Semester 2005: September 8, 2005 - December 12, 2005

Assignment 1: Fun with Hash Functions

Out on: September 12, 2005
Due by: September 18, 2005 by 5:59 pm for full credit (11:59 pm for 10% off, hard deadline)
Collaboration: None
Grading: Packaging 10%, Style 10%, Performance 10%, Design 20%, Functionality 50%

Overview

The first assignment for 600.211: Unix Systems Programming is mostly a warmup exercise. You will have to (re-)learn the C language, including some of the trickier features like function pointers. You will also have to get familiar with the gcc compiler, basic file processing using the standard C library, some Unix system calls for configuration and time, and several pervasive Unix tools such as tar, gzip, and make. Convince us that you're in the right course!

Details

Your task is to develop a small framework for empirically evaluating the quality of various hash functions. The set of hash functions is closed ("compiled in") so you won't have to worry about accepting hash functions as input. Here is what your program hashtest should do:

  1. Read a sequence of strings separated by newlines from one of the following sources:
  2. For each string:
  3. Generate summary statistics comparing hash functions
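
Since the set of hash functions is compiled in, one way to organize the steps above is a fixed table of function pointers. The following is only a minimal sketch under stated assumptions: the names hash_fn, hash_entry, hash_fns, and count_string are hypothetical, and hash_sum and hash_len are toy placeholders standing in for the real hash functions you will download. It is written in C89 style so that it should build cleanly with -ansi -pedantic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed common signature: every hash maps a NUL-terminated string
   to an unsigned value. Real downloaded code may need a small wrapper. */
typedef unsigned long (*hash_fn)(const char *s);

/* Toy placeholder: sum of the bytes of the string. */
unsigned long hash_sum(const char *s)
{
    unsigned long h = 0;
    while (*s != '\0') {
        h += (unsigned char) *s++;
    }
    return h;
}

/* Toy placeholder: length of the string (deliberately poor). */
unsigned long hash_len(const char *s)
{
    return (unsigned long) strlen(s);
}

/* One row per compiled-in hash function (hypothetical layout). */
struct hash_entry {
    const char *name;
    hash_fn fn;
};

const struct hash_entry hash_fns[] = {
    { "sum", hash_sum },
    { "len", hash_len }
};

#define NUM_HASHES (sizeof hash_fns / sizeof hash_fns[0])

/* Hash one string with function number `which` and bump the counter
   of the bucket it falls into; `size` is the number of table entries. */
void count_string(size_t which, const char *s,
                  unsigned long *buckets, size_t size)
{
    assert(which < NUM_HASHES);
    assert(size > 0);
    buckets[hash_fns[which].fn(s) % size]++;
}
```

With this layout, adding another hash function means adding one row to hash_fns; the loop that calls count_string for every input string and every table row does not change.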

For example, invoking ./hashtest manifesto.txt -s 128 --nam -p will use manifesto.txt in the current directory as input, use a hash table with 128 entries, and print results sorted by name followed by results sorted by performance.
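
Printing the same results in two different orders is a natural fit for qsort from the standard C library, which takes the comparison as a function pointer. The sketch below is an illustration only: the struct result record, its score field (assumed here to mean "lower is better"), and the function names are all hypothetical, since this handout does not fix how performance is measured.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-function result record. */
struct result {
    const char *name;   /* name of the hash function */
    double score;       /* assumed measure, lower is better */
};

/* qsort comparator: ascending by name. */
int by_name(const void *a, const void *b)
{
    const struct result *x = (const struct result *) a;
    const struct result *y = (const struct result *) b;
    return strcmp(x->name, y->name);
}

/* qsort comparator: best (lowest) score first. */
int by_score(const void *a, const void *b)
{
    const struct result *x = (const struct result *) a;
    const struct result *y = (const struct result *) b;
    if (x->score < y->score) {
        return -1;
    }
    if (x->score > y->score) {
        return 1;
    }
    return 0;
}

/* Sort n results in place using the given comparator. */
void sort_results(struct result *r, size_t n,
                  int (*cmp)(const void *, const void *))
{
    qsort(r, n, sizeof r[0], cmp);
}
```

Calling sort_results(r, n, by_name), printing, then calling sort_results(r, n, by_score) and printing again would give the two listings in the order the options were given.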

Note that this is just one particular (read: arbitrary) way we can go about evaluating hash functions; for certain applications, e.g. cryptography, other characteristics might be much more important.

The Hash Functions

Here are the hash functions you must support, but you're free to add more hash functions beyond this set if you are interested. Please use the names given here to identify these hash functions in your output.

There is not much use in you re-implementing hash functions, so just download the relevant code from the sources given and integrate it with your framework. Watch out for license problems though and give proper credit.

Hints

Deliverables

Please turn in a gzip-compressed tarball of your assignment (the extension should be .tar.gz). The tarball should uncompress into a directory cs211-assignment-1-login with login replaced by your Unix login name (so I would use cs211-assignment-1-phf); uncompressing should not create any other files in the current directory. The tarball should contain no derived files whatsoever, but allow building all derived files with make. We expect that your Makefile handles "the usual" targets like clean and test aside from all (which is the main way we will build your program). Include a README file that briefly explains what the program does and contains any other notes you want us to check out before grading. Include other "common" files such as INSTALL describing how to install your tool, CREDITS to pay your respects to the people whose code you're reusing, and LICENSE to describe copyright and distribution terms if you feel like it. You can look at any number of "famous" open source projects to see what kind of structure is appropriate; gif2png is a relatively small example, but you don't need everything in there. Aside from your code, what you really need is a README and a Makefile that works. :-)

Grading

For reference, here is a short explanation of the grading criteria. Packaging refers to the proper organization of the stuff you hand in, following the guidelines for Deliverables above. Style refers to C programming style, including things like consistent indentation, appropriate identifiers, useful comments, etc. Simple, clean, readable code is what you should be aiming for. Performance refers to how fast your program can produce the required results compared to other submissions. Design refers to proper modularization and the proper choice of algorithms and data structures; often this can be judged by asking "How hard would it be to add feature X?" and "How hard is it to replace algorithm X with algorithm Y?". Functionality refers to your program being able to do what it should according to the specification given above; if the specification is ambiguous and you had to make a certain choice, defend that choice in your README file.

If your program cannot be built you will get no points whatsoever. If your program cannot be built using make we will take off 10%. If your program cannot be built without warnings using gcc -W -Wall -O -ansi -pedantic we will take off 10% (except if you document a very good reason). If your program fails miserably even once, e.g. segfaults or runs forever, we will take off 10%.

Bonus Feature

If you really want to impress us, add some kind of "cool" bonus feature to your program. We won't give you extra points, but we'll give you extra kudos. :-) Be sure to point out any bonus features you have in your README file.

Updated: $Id: assignment-1.html 30 2005-09-12 12:36:36Z phf $
Copyright © 2005 Peter H. Fröhlich. All rights reserved.