Spring Semester 2008

January 28, 2008 – May 2, 2008

Assignment 4: All your Data in our Base!

Out on: February 18, 2008
Due by: February 25, 2008, 3:00 pm (before lecture)
Collaboration: None
Grading: Packaging 10%, Design 10%, Style 20%, Functionality 60%

Overview

The fourth assignment asks you to implement a very simple database system that can store keys and their associated values persistently on disk. It is the most programming-intensive assignment so far, and I highly recommend that you start working on it as soon as you can. You won't be able to finish it if you wait until the last day before it's due!

The Interface

This is where you can get the latest version of the interface you need. Please watch the mailing list carefully for changes to this!

The Examples

This is where you can get some simple example programs using the interface. Feel free to ask questions about these on the mailing list!

Problem 1: Interface Critique (20%)

The simple database you are implementing needs to conform to the well-defined interface provided above. This problem asks you to write a short critique of that interface.

It's probably easier to do this problem after you have already tried to implement the interface. Are there operations we missed? Could we have made the interface simpler to use or implement? Do you think the way the interface is defined could be improved? Does the interface design cause performance problems? These are the kinds of questions you want to address, but you will certainly be able to think of more.

If you have a better way to define the interface, make sure to include a draft of your suggested better_sdbm.h file.

Problem 2: Database Implementation (80%)

The simple database you are implementing needs to conform to the well-defined interface provided above. Do not change the interface for any reason! If an error needs to be corrected, we will post a new version of the interface with a new checksum!

It's easy to describe your task: Implement the interface! :-) Of course there are lots of decisions you'll need to make in order to do that, most importantly how you will actually store the database on disk. A relatively simple and popular way is to use two files:

We included this description only as an example; you can of course choose a completely different way to organize things. The whole point of defining an interface is that clients of the database don't have to worry about how the database works internally, so in that sense implementation details are not really important. What is important is that you actually store data on disk and not just in memory: if one program writes a new key/value pair to the database and exits, and another program is then started that looks for the same key, it should find the value stored there by the previous program.
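To make the persistence requirement concrete, here is a minimal sketch of one possible on-disk organization: a single append-only file of fixed-size records. This is not the required interface and not the only reasonable layout. The names toy_put, toy_get, struct toy_record and the limit TOY_MAX are all hypothetical and chosen for illustration only; your sdbm.c must implement the functions declared in the posted interface instead.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical limit on key and value length, including the '\0'. */
#define TOY_MAX 256

/* One fixed-size record; a fixed layout keeps the file format trivial. */
struct toy_record {
    char key[TOY_MAX];
    char value[TOY_MAX];
};

/*
 * Append a key/value pair to the file at path.
 * Returns 1 on success, 0 on failure.
 */
int toy_put(const char *path, const char *key, const char *value)
{
    struct toy_record rec;
    FILE *fp;
    int ok;

    if (path == NULL || key == NULL || value == NULL) {
        return 0;
    }
    if (strlen(key) >= TOY_MAX || strlen(value) >= TOY_MAX) {
        return 0;
    }

    /* Zero the record so that no uninitialized bytes reach the disk. */
    memset(&rec, 0, sizeof rec);
    strcpy(rec.key, key);
    strcpy(rec.value, value);

    fp = fopen(path, "ab"); /* append mode: existing records are kept */
    if (fp == NULL) {
        return 0;
    }
    ok = (fwrite(&rec, sizeof rec, 1, fp) == 1);
    if (fclose(fp) != 0) {
        ok = 0;
    }
    return ok;
}

/*
 * Look up key in the file at path and copy the most recently stored
 * value into out, which must have room for at least TOY_MAX bytes.
 * Returns 1 if the key was found, 0 otherwise.
 */
int toy_get(const char *path, const char *key, char *out, size_t out_size)
{
    struct toy_record rec;
    FILE *fp;
    int found = 0;

    if (path == NULL || key == NULL || out == NULL || out_size < TOY_MAX) {
        return 0;
    }
    fp = fopen(path, "rb");
    if (fp == NULL) {
        return 0; /* no file yet means no data yet */
    }
    while (fread(&rec, sizeof rec, 1, fp) == 1) {
        /* Guard against a damaged file with unterminated strings. */
        rec.key[TOY_MAX - 1] = '\0';
        rec.value[TOY_MAX - 1] = '\0';
        if (strcmp(rec.key, key) == 0) {
            strcpy(out, rec.value);
            found = 1; /* keep scanning: later records override earlier ones */
        }
    }
    fclose(fp);
    return found;
}
```

Because every toy_put reaches the file before the function returns, a value written by one program is visible to toy_get in a program started later, which is exactly the behavior described above. The price of this simplicity is a linear scan on every lookup and a file that only ever grows.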

Please put your implementation into a file sdbm.c and make sure that sdbm.o builds by itself so it can be linked to the separately compiled test programs.

Hints

Deliverables

Please turn in a gzip compressed tarball of your assignment; the filename should be cs120-assign-4-login.tar.gz with login replaced by your Unix login name on ugradx.cs.jhu.edu (so I would use cs120-assign-4-phf.tar.gz). The tarball should contain no derived files whatsoever (i.e. no executable files), but allow building all derived files. Include a README file that briefly explains what your programs do and contains any other notes you want us to check out before grading.

Grading

For reference, here is a short explanation of the grading criteria. Packaging refers to the proper organization of the stuff you hand in, following the guidelines for Deliverables above. Style refers to C programming style, including things like consistent indentation, appropriate identifiers, useful comments, suitable documentation, etc. Simple, clean, readable code is what you should be aiming for. Performance refers to how fast your program can produce the required results compared to other submissions. Design refers to proper modularization and the proper choice of algorithms and data structures. Functionality refers to your programs being able to do what they should according to the specification given above; if the specification is ambiguous and you had to make a certain choice, defend that choice in your README file.

If your programs cannot be built you will get no points whatsoever. If your programs cannot be built without warnings using gcc -ansi -pedantic -Wall -Wextra -std=c99 -O we will take off 10% (except if you document a very good reason). If your programs cannot be built using make we will take off 10%. If your programs fail miserably even once, i.e. terminate with an exception of any kind or dump core, we will take off 10%. Finally, make sure to include your name and email address in every file you turn in!