Spring Semester 2006: January 30, 2006 - May 5, 2006
Out on: April 14, 2006
Due by: April 21, 2006 by 5:59 pm for full credit (11:59 pm for 10% off, hard deadline)
Collaboration: Teams
Grading: Packaging 10%, Style 10%, Performance 10%, Design 10%, Functionality 60%
The tenth assignment for 600.226: Data Structures once again deals with maps. No, not those kinds of maps, but actual maps, of the Baltimore area to be exact. Your "big project" from now until the end of the semester is to implement a variation on the theme that made MapQuest, Yahoo! Maps, and of course Google Maps famous.
This is a team assignment, and all of you should
know which team you're in (I hope).
Each team hands in one assignment!
Decide early on who is going to be responsible for submitting the
assignment and when.
Make sure to include all the relevant information (who is in the
team?) in your README file!
All members of a team will receive the same score for the product
you submit together.
You are a startup company that has decided to go into the hot field of geographic information systems (GIS). So far you have not been able to obtain a lot of venture capital because investors are still skeptical about funding ideas instead of working products after the dot-com bust. What you need to do, and do quickly before your money runs out in about four weeks, is produce a working prototype of your brilliant ideas for a cool mapping application.
Another complication is that about eleven other startup companies in the area had the same idea, and all of you are now competing for capital from the same lazy bunch of investors. So whatever you produce in the next four weeks had better be "special enough" to outdo the competition and secure your funding. Want more "bad news"? The money you are spending right now to keep your development running came from Sun Microsystems. Of course those guys insisted that you use Java and only standard Java libraries for your prototype, no additional libraries whatsoever. So you're stuck with that as well...
Disclaimer: Except for the restriction to standard Java and standard Java libraries, the above account is entirely fictitious. Your task, however, is to internalize the story anyway! :-)
Your first real task is to get a hold of the relevant
data for your maps.
I was surprised to find that the U.S.
Census Bureau offers such data for free:
Check out the TIGER
project, and specifically the
TIGER/Line file sets.
There is data for all of the United States, but I
strongly recommend that you concentrate on just
Baltimore ("Charm City") for now.
You can use this list of state and county codes to figure out which files you need to
download; the FIPS code
for Baltimore, MD is 24510; the relevant information is located in the
file MD/tgr24510.zip on the
TIGER/Line website.
Before you can use any of this data, you obviously
first need to understand the information contained
in the tgr24510.zip file.
This is quite a task all by itself; the sheer amount of information
available can be a little overwhelming at first.
The good news is that the U.S. Census Bureau provides all the necessary
documentation in
this PDF file;
the bad news is that the documentation itself
is also quite big and takes a while to grok.
However, there's no way around understanding the information, so you
should spend some time reading the PDF manual and exploring the various
data files contained in tgr24510.zip.
All team members should participate in this task; otherwise you'll always have someone without a clue asking you how to get a certain piece of information!
Your goal for this problem is to write up a concise description of the available data as far as you understand it, and to outline what pieces of information you want to use for your project. You will obviously need all the basic geographic information, e.g. the coordinates of all relevant streets and intersections. But there is a lot of information that can be used for additional features that would set your project apart from those of other teams. For example, there are additional street-level details that allow more accurate measurement of distances or more accurate drawing of maps; there is information about different kinds of streets (freeway, unpaved road, etc.); there is political and economic information (at least some basics); there are landmarks, parks, lots, etc.; and I think there is zoning information (residential, commercial) as well.
Write up your analysis of the available data in a plain Unix text
file called DATA; be as concise as you possibly can,
without being "sloppy" that is.
First describe the general structure of the data files contained in
tgr24510.zip and how you can cross-reference information
from one file to another file.
Then explain the basic geographic information you will need, what files
it is stored in, and how you are going to use it.
Finally, explain the additional information you think would be helpful
later to add "cool features" to your application, what files it is
stored in, and how you are going to use it.
Make sure you remember to hand your DATA file in when you
submit the rest of your assignment; place it into the same directory
as the README file.
Your second task is to implement a first prototype of your application. The prototype should have at least the following features:
Please don't make the mistake of writing code that will only allow you to do these specific things. If you do that, you'll hack yourself into a corner that will be hard to get out of when other teams "breeze by" with a more modular and flexible design, adding features left and right while you are still trying to fix your basic code.
Instead you should "take a step back" and identify the
key abstractions in your data and your application.
For example, you certainly want some kind of Position
abstraction for coordinates;
you probably want some kind of Segment abstraction that
represents a "connection" between two positions;
you may want some higher-level abstractions like Street,
consisting of several segments;
Intersection might be an abstraction as well, and there
are probably lots more.
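To make the idea of key abstractions a little more concrete, here is a minimal sketch of what Position and Segment could look like. Everything in it is an assumption made for illustration: the field names, the choice of decimal degrees, the haversine distance with a mean Earth radius of 6371 km, and the coordinates and street name in the demo. Your own design may well look different.

```java
/** An immutable point on the map, in decimal degrees (an assumed representation). */
final class Position {
    final double latitude;
    final double longitude;

    Position(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }

    /** Approximate great-circle distance in kilometers (haversine formula). */
    double distanceTo(Position other) {
        final double earthRadiusKm = 6371.0; // mean Earth radius, an approximation
        double dLat = Math.toRadians(other.latitude - latitude);
        double dLon = Math.toRadians(other.longitude - longitude);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat
                + Math.cos(Math.toRadians(latitude))
                * Math.cos(Math.toRadians(other.latitude))
                * sinLon * sinLon;
        return 2 * earthRadiusKm * Math.asin(Math.sqrt(a));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Position)) {
            return false;
        }
        Position p = (Position) o;
        return Double.compare(latitude, p.latitude) == 0
                && Double.compare(longitude, p.longitude) == 0;
    }

    @Override
    public int hashCode() {
        return 31 * Double.valueOf(latitude).hashCode()
                + Double.valueOf(longitude).hashCode();
    }

    @Override
    public String toString() {
        return "(" + latitude + ", " + longitude + ")";
    }
}

/** A straight "connection" between two positions that belongs to a named street. */
final class Segment {
    final String streetName;
    final Position from;
    final Position to;

    Segment(String streetName, Position from, Position to) {
        this.streetName = streetName;
        this.from = from;
        this.to = to;
    }

    /** Length of this segment in kilometers. */
    double length() {
        return from.distanceTo(to);
    }

    @Override
    public String toString() {
        return streetName + ": " + from + " -> " + to;
    }
}

class AbstractionsDemo {
    public static void main(String[] args) {
        // Made-up coordinates and street name, for illustration only.
        Position a = new Position(39.3299, -76.6205);
        Position b = new Position(39.2904, -76.6122);
        Segment s = new Segment("Hypothetical St", a, b);
        System.out.println(s + " is about " + s.length() + " km long");
    }
}
```

Making Position immutable and giving it equals and hashCode means it can be used safely as a key in a HashMap, which may come in handy once you start looking for intersections.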
A group brainstorming session over coffee or something might be useful,
and is often more efficient than doing stuff like this in email.
Also, identify the key use cases of your application; the features above are just examples, and they are just relevant for this first prototype. What you need to figure out (and another brainstorming session might be good for this as well) is all the things that you want a user to be able to do; well, maybe not all of them, but at least a good number of things. Once you have a list of use cases, try to order them by priority: how important is it to have a certain feature? There's a lot of guesswork here since you don't exactly know what the other teams will be doing, but give it a good shot. Finally, add the features the assignment requires at the top of your list, since you certainly want to get those done.
From this ordered list of use cases, pick the first 10 or so, and then try to explain (to each other) how you can implement each use case given the abstractions you identified before. This is where your knowledge of algorithms and data structures comes in, but it is also where you will probably notice that you can't explain in detail how some things should be done. That's a sign that you missed some abstraction, or that you're still not clear enough about what kind of information you have and what kind you need to compute; more research and experiments might be needed. Iterate this process for some time to get a good understanding of the kinds of modules and classes you'll need to write; but don't do it for too long, because you still need the time to actually write the code.
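As an illustration of explaining a use case in terms of data structures, consider a hypothetical use case such as "which streets meet at this point?". One possible explanation is an index from segment endpoints to the names of the streets that touch them. The sketch below assumes integer coordinates (for example millionths of a degree, if that is how your data stores them) so that two endpoints can be compared exactly; the class name, the method names, and the street names and coordinates in the demo are all made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Maps segment endpoints to the names of the streets that touch them. */
class IntersectionIndex {
    // Sorted maps and sets keep the output order predictable.
    private final Map<String, Set<String>> streetsByPoint =
            new TreeMap<String, Set<String>>();

    private static String key(int lat, int lon) {
        return lat + "," + lon;
    }

    /** Registers both endpoints of a segment under the given street name. */
    void addSegment(String street, int fromLat, int fromLon, int toLat, int toLon) {
        register(key(fromLat, fromLon), street);
        register(key(toLat, toLon), street);
    }

    private void register(String point, String street) {
        Set<String> streets = streetsByPoint.get(point);
        if (streets == null) {
            streets = new TreeSet<String>();
            streetsByPoint.put(point, streets);
        }
        streets.add(street);
    }

    /** Names of all streets with a segment endpoint at the given point. */
    Set<String> streetsAt(int lat, int lon) {
        Set<String> streets = streetsByPoint.get(key(lat, lon));
        if (streets == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(streets);
    }

    /** Points where at least two differently named streets meet. */
    List<String> intersections() {
        List<String> result = new ArrayList<String>();
        for (Map.Entry<String, Set<String>> e : streetsByPoint.entrySet()) {
            if (e.getValue().size() >= 2) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}

class IntersectionDemo {
    public static void main(String[] args) {
        IntersectionIndex index = new IntersectionIndex();
        // Made-up segments: two pieces of one street and one cross street.
        index.addSegment("First St", 39300000, -76620000, 39300000, -76610000);
        index.addSegment("First St", 39300000, -76610000, 39300000, -76600000);
        index.addSegment("Cross Ave", 39300000, -76610000, 39310000, -76610000);
        System.out.println(index.streetsAt(39300000, -76610000));
        System.out.println(index.intersections());
    }
}
```

Note what this sketch glosses over: it tells streets apart by name only, so two different streets that happen to share a name would be merged. If you cannot say how your design handles a case like that, it is probably a sign that an abstraction is still missing.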
Speaking of code, please don't try to write the whole application all at once. Instead, start by writing some experimental code (also known as a "tracer bullet") that does one particular thing that you don't know how to do yet. A good first "tracer bullet" is a short program that reads some data file and just prints out the pieces of information as you recognize them; if you write two or three of these (quick to do) you'll have a much better understanding of how the data files work, and you can use that experience (although maybe not the actual code) to design the part of your actual system that reads the data and builds certain data structures out of it. This applies to features of Java you don't know yet as well. For example, if you've never written a Java GUI, but you want to add one to stay ahead of the competition, start by writing very simple examples instead of trying to write the GUI code for your application right away; again you'll gain experience with the Java API this way, experience that you can use to write the "real" code with much more confidence. The idea even applies to the mapping application as a whole: Develop in small, controlled steps; add one small feature, test it, debug it, make it work; then commit the code; then work on the next feature. Don't try to add too many things at once; you're more likely to lose track of what is going on and set your whole team back in the process.
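A first "tracer bullet" along these lines might look roughly like the sketch below. It assumes the data files turn out to be plain text with one record per line and fixed-width columns, which is how I understand the TIGER/Line format. The column ranges in describe are placeholders only; check them against the PDF documentation before trusting anything this program prints. The class and method names are made up as well.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

/** Throwaway experiment: print a few fields from the first lines of a data file. */
class TracerBullet {
    /** Returns columns start..end (1-based, inclusive) of a line, trimmed. */
    static String field(String line, int start, int end) {
        // Clamp both ends so that short lines do not cause an exception.
        int from = Math.min(start - 1, line.length());
        int to = Math.min(end, line.length());
        return line.substring(from, to).trim();
    }

    /** Formats the fields we think we recognize; column ranges are placeholders. */
    static String describe(String line) {
        String type = field(line, 1, 1);    // assumed: record type
        String id = field(line, 6, 15);     // assumed: record identifier
        String name = field(line, 20, 49);  // assumed: feature name
        return "type=" + type + " id=" + id + " name=" + name;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java TracerBullet <data-file>");
            return;
        }
        try {
            BufferedReader in = new BufferedReader(new FileReader(args[0]));
            try {
                String line;
                int count = 0;
                // Twenty lines are enough to see whether the guesses are right.
                while ((line = in.readLine()) != null && count < 20) {
                    System.out.println(describe(line));
                    count++;
                }
            } finally {
                in.close();
            }
        } catch (IOException e) {
            System.err.println("cannot read " + args[0] + ": " + e.getMessage());
        }
    }
}
```

If the output looks like garbage, that is useful information too: adjust the column ranges, run it again, and write down what you learn in your DATA file.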
Aside from the code you develop for this problem (and thus your project)
you must also include a plain Unix text file PLAN that contains
your "design notes" from the process outlined above; this doesn't have to
be some fancy prose document, but it should describe your key abstractions
and key use cases in enough detail for us to follow your ideas.
Here is a list of features you might want to think about as you are trying to produce the coolest mapping application in years:
There are many more features you can think of. I'll include a few more suggestions on the following assignments, when you are actually required to do proper route planning using Dijkstra's algorithm.
Please turn in a
gzip
compressed
tarball
of your assignment;
the filename should be
cs226-assign-10-teamcode.tar.gz
with teamcode
replaced by the code assigned to your team for your repository
and your mailing list.
The tarball should contain no derived files whatsoever
(i.e. no .class files, no .html files, etc.),
but allow building all derived files.
And your tarball should definitely not contain copies of
the data files, but rather instructions on where we have to put them for
your code to find them.
Include a README file that briefly explains what your
programs do and contains any other notes you want us to check out
before grading.
For reference, here is a short explanation of the grading criteria.
Packaging refers to the proper organization of the
stuff you hand in, following the guidelines for Deliverables above.
Style refers to Java programming style, including
things like consistent indentation, appropriate identifiers,
useful comments, suitable javadoc documentation, etc.
Simple, clean, readable code is what you should be aiming for.
Performance refers to how fast your program can
produce the required results compared to other submissions.
Design refers to proper modularization and the
proper choice of algorithms and data structures.
Functionality refers to your programs being
able to do what they should according to the specification
given above; if the specification is ambiguous and you had
to make a certain choice, defend that choice in your
README file.
If your programs cannot be built you will get no points whatsoever.
If your programs cannot be built without warnings using
javac -Xlint
we will take off 10% (except if you document a very good reason).
If your programs fail miserably even once,
i.e. terminate with an exception of any kind,
we will take off 10%.
This project is based on a similar assignment by
Prof. Dr. Jason Eisner:
Thanks a bunch for the inspiration! :-)
You are free to read
Jason's assignment
which contains lots of advice and background information
that I am not providing.
However, remember that you are doing this
version, not Jason's version.
If you use any of Jason's ideas, please give proper credit in your
README file.