600.226: Data Structures

Spring Semester 2006: January 30, 2006 - May 5, 2006

Assignment 10: Mapping Charm City

Out on: April 14, 2006
Due by: April 21, 2006 by 5:59 pm for full credit (11:59 pm for 10% off, hard deadline)
Collaboration: Teams
Grading: Packaging 10%, Style 10%, Performance 10%, Design 10%, Functionality 60%

Overview

The tenth assignment for 600.226: Data Structures once again deals with maps. No, not those kinds of maps, but actual maps, of the Baltimore area to be exact. Your "big project" from now until the end of the semester is to implement a variation on the theme that made MapQuest, Yahoo! Maps, and of course Google Maps famous.

This is a team assignment, and all of you should know which team you're in (I hope). Each team hands in one assignment! Decide early on who is going to be responsible for submitting the assignment and when. Make sure to include all the relevant information (who is in the team?) in your README file! All members of a team will receive the same score for the product you submit together.

Background Briefing

You are a startup company that has decided to go into the hot field of geographic information systems (GIS). So far you have not been able to obtain a lot of venture capital because investors are still skeptical about funding ideas instead of working products after the dot-com bust. What you need to do, and do quickly before your money runs out in about four weeks, is produce a working prototype of your brilliant ideas for a cool mapping application.

Another complication is that about eleven other startup companies in the area had the same idea, and all of you are now competing for capital from the same lazy bunch of investors. So whatever you produce in the next four weeks had better be "special enough" to outdo the competition and secure your funding. Want more "bad news"? The money you are spending right now to keep your development running came from Sun Microsystems. Of course those guys insisted that you use Java and only standard Java libraries for your prototype, no additional libraries whatsoever. So you're stuck with that as well...

Disclaimer: Except for the restriction to standard Java and standard Java libraries, the above account is entirely fictitious. Your task, however, is to internalize the story anyway! :-)

Problem 1: Grappling with Data

Your first real task is to get a hold of the relevant data for your maps. I was surprised to find that the U.S. Census Bureau offers such data for free: Check out the TIGER project, and specifically the TIGER/Line file sets. There is data for all of the United States, but I strongly recommend that you concentrate on just Baltimore ("Charm City") for now. You can use this list of state and county codes to figure out which files you need to download; the FIPS code for Baltimore, MD is 24510; the relevant information is located in the file MD/tgr24510.zip on the TIGER/Line website.

Before you can use any of this data, you obviously first need to understand the information contained in the tgr24510.zip file. This is quite a task all by itself; the sheer amount of information available can be a little overwhelming at first. The good news is that the U.S. Census Bureau provides all the necessary documentation in this PDF file; the bad news is that the documentation itself is also quite big and takes a while to grok. However, there's no way around understanding the information, so you should spend some time reading the PDF manual and exploring the various data files contained in tgr24510.zip.

All team members should participate in this task, otherwise you'll always have someone without a clue asking you how to get a certain piece of information!

Your goal for this problem is to write up a concise description of the available data as far as you understand it, and to outline what pieces of information you want to use for your project. You will obviously need all the basic geographic information, e.g. the coordinates of all relevant streets and intersections. But there is a lot of information that can be used for additional features that would set your project apart from those of other teams. For example, there are additional street-level details that allow more accurate measurement of distances or more accurate drawing of maps; there is information about different kinds of streets (freeway, unpaved road, etc.); there is political and economic information (at least some basics); there are landmarks, parks, lots, etc.; and I think there is zoning information (residential, commercial) as well.

Write up your analysis of the available data in a plain Unix text file called DATA; be as concise as you possibly can, without being "sloppy" that is. First describe the general structure of the data files contained in tgr24510.zip and how you can cross-reference information from one file to another file. Then explain the basic geographic information you will need, what files it is stored in, and how you are going to use it. Finally, explain the additional information you think would be helpful later to add "cool features" to your application, what files it is stored in, and how you are going to use it. Make sure you remember to hand your DATA file in when you submit the rest of your assignment; place it into the same directory as the README file.
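To make "cross-referencing" a little more concrete, here is a minimal sketch of one common way to do it: build an index from a shared identifier to all the detail entries that carry it, then look entries up by that identifier. Everything in it is an assumption for illustration only: the class name CrossReferenceSketch, the method indexByKey, and the made-up identifiers and payload strings are hypothetical and are not taken from the TIGER/Line files. Which fields actually link which files is exactly what your DATA write-up has to establish from the documentation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: joins "detail" entries to a shared identifier.
// The identifiers and payloads below are invented, not real TIGER/Line data.
class CrossReferenceSketch {

    /**
     * Builds an index from a shared identifier (element 0 of each pair)
     * to all payloads (element 1) that carry this identifier.
     */
    static Map<String, List<String>> indexByKey(List<String[]> keyValuePairs) {
        Map<String, List<String>> index = new HashMap<String, List<String>>();
        for (String[] pair : keyValuePairs) {
            List<String> bucket = index.get(pair[0]);
            if (bucket == null) {
                bucket = new ArrayList<String>();
                index.put(pair[0], bucket);
            }
            bucket.add(pair[1]);
        }
        return index;
    }

    public static void main(String[] args) {
        // Pretend these pairs were read from a second data file.
        List<String[]> details = new ArrayList<String[]>();
        details.add(new String[] {"1001", "detail A"});
        details.add(new String[] {"1002", "detail B"});
        details.add(new String[] {"1001", "detail C"});

        Map<String, List<String>> index = indexByKey(details);

        // Look up everything that belongs to the entry with identifier 1001.
        System.out.println("Details for 1001: " + index.get("1001"));
    }
}
```

A hash-based index makes each lookup cheap once the file has been read; whether you need one index per file or something fancier depends on what you find in the data.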

Problem 2: The Mapping Prototype

Your second task is to implement a first prototype of your application. The prototype should have at least the following features:

Please don't make the mistake of writing code that will only allow you to do these specific things. If you do that, you'll hack yourself into a corner that will be hard to get out of when other teams "breeze by" with a more modular and flexible design, adding features left and right while you are still trying to fix your basic code.

Instead you should "take a step back" and identify the key abstractions in your data and your application. For example, you certainly want some kind of Position abstraction for coordinates; you probably want some kind of Segment abstraction that represents a "connection" between two positions; you may want some higher-level abstractions like Street, consisting of several segments; Intersection might be an abstraction as well, and there are probably lots more. A group brainstorming session over coffee or something might be useful, and is often more efficient than doing stuff like this in email.
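As a starting point for that brainstorming session, here is a rough sketch of what Position and Segment could look like. Treat it as one possible design under stated assumptions, not as the required one: it assumes coordinates are kept as decimal degrees, it approximates the Earth as a sphere and uses the haversine formula for distances, and all method names as well as the sample coordinates (roughly somewhere in Baltimore) are made up for illustration.

```java
// Hypothetical sketch of two key abstractions; names and design are assumptions.

/** A geographic position in decimal degrees. */
final class Position {
    // Mean Earth radius; treating the Earth as a sphere is an approximation.
    private static final double EARTH_RADIUS_METERS = 6371000.0;

    private final double latitude;
    private final double longitude;

    Position(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }

    double latitude() { return latitude; }
    double longitude() { return longitude; }

    /** Great-circle distance in meters to another position (haversine formula). */
    double distanceTo(Position other) {
        double phi1 = Math.toRadians(latitude);
        double phi2 = Math.toRadians(other.latitude);
        double dPhi = phi2 - phi1;
        double dLambda = Math.toRadians(other.longitude - longitude);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                 + Math.cos(phi1) * Math.cos(phi2)
                 * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);
        // Clamp to 1.0 to guard against rounding errors before asin.
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}

/** A "connection" between two positions. */
final class Segment {
    private final Position from;
    private final Position to;

    Segment(Position from, Position to) {
        this.from = from;
        this.to = to;
    }

    Position from() { return from; }
    Position to() { return to; }

    /** Length of the segment in meters. */
    double length() { return from.distanceTo(to); }
}

class AbstractionSketch {
    public static void main(String[] args) {
        // Illustrative coordinates only, not taken from the data files.
        Position a = new Position(39.2904, -76.6122);
        Position b = new Position(39.2974, -76.6122);
        Segment s = new Segment(a, b);
        System.out.println("Segment length in meters: " + s.length());
    }
}
```

Both classes are immutable, which makes them safe to share between data structures; a Street could then simply hold a list of Segment objects.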

Also, identify the key use cases of your application; the features above are just examples, and they are just relevant for this first prototype. What you need to figure out (and another brainstorming session might be good for this as well) is all the things that you want a user to be able to do; well, maybe not all of them, but at least a good number of things. Once you have a list of use cases, try to order them by priorities: how important is it to have a certain feature? There's a lot of guess work here since you don't exactly know what the other teams will be doing, but give it a good shot. Finally, add the features the assignment requires at the top of your list, since you certainly want to get those done.

From this ordered list of use cases, pick the first 10 or so, and then try to explain (to each other) how you can implement each use case given the abstractions you identified before. This is where your knowledge of algorithms and data structures comes in, but it is also where you will probably notice that you can't explain in detail how some things should be done. That's a sign that you missed some abstraction, or that you're still not clear enough about what kind of information you have and what kinds you need to compute; more research and experiments might be needed. Iterate this process for some time to get a good understanding of the kinds of modules and classes you'll need to write; but don't do it for too long, because you still need the time to actually write the code.

Speaking of code, please don't try to write the whole application all at once. Instead, start by writing some experimental code (also known as a "tracer bullet") that does one particular thing that you don't know how to do yet. A good first "tracer bullet" is a short program that reads some data file and just prints out the pieces of information as you recognize them; if you write two or three of these (quick to do), you'll have a much better understanding of how the data files work, and you can use that experience (although maybe not the actual code) to design the part of your actual system that reads the data and builds certain data structures out of it. This applies to features of Java you don't know yet as well. For example, if you've never written a Java GUI, but you want to add one to stay ahead of the competition, start by writing very simple examples; don't try to write the GUI code for your application right away. Again you'll gain experience with the Java API this way, experience that you can use to write the "real" code with much more confidence. The idea even applies to the mapping application as a whole: Develop in small, controlled steps; add one small feature, test it, debug it, make it work; then commit the code; then work on the next feature. Don't try to add too many things at once; you're more likely to lose track of what is going on and set your whole team back in the process.
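To give you an idea of how small a "tracer bullet" can be, here is a minimal sketch of one. It assumes the data file consists of fixed-width text records, i.e. each piece of information sits in a known range of columns on a line. The class name TracerBullet, the helper field, and above all the column ranges used in main are placeholders I made up; you have to replace them with the ranges you find in the documentation.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical tracer bullet: prints selected columns of the first few lines.
class TracerBullet {

    /**
     * Extracts columns start..end (1-based, inclusive) from a fixed-width
     * line and trims surrounding blanks. Columns beyond the end of the line
     * are treated as missing, so short lines yield a shorter or empty result.
     */
    static String field(String line, int start, int end) {
        if (start < 1 || end < start) {
            throw new IllegalArgumentException("bad column range");
        }
        if (start > line.length()) {
            return "";
        }
        int stop = Math.min(end, line.length());
        return line.substring(start - 1, stop).trim();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java TracerBullet <data-file>");
            return;
        }
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            String line;
            int count = 0;
            // Only look at the first ten records; this is an experiment.
            while ((line = in.readLine()) != null && count < 10) {
                // Placeholder column ranges: check them against the manual!
                System.out.println("first=" + field(line, 1, 1)
                        + " second=" + field(line, 6, 15));
                count++;
            }
        } catch (IOException e) {
            System.err.println("Could not read " + args[0] + ": " + e.getMessage());
        }
    }
}
```

Run it on one of the unpacked data files, compare the output with what the documentation says should be there, and adjust the column ranges until the two agree.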

Aside from the code you develop for this problem (and thus your project) you must also include a plain Unix text file PLAN that contains your "design notes" from the process outlined above; this doesn't have to be some fancy prose document, but it should describe your key abstractions and key use cases in enough detail for us to follow your ideas.

Some Possible Features

Here is a list of features you might want to think about as you are trying to produce the coolest mapping application in years:

There are many more features you can think of. I'll include a few more suggestions on the following assignments, when you are actually required to do proper route planning using Dijkstra's algorithm.

Deliverables

Please turn in a gzip compressed tarball of your assignment; the filename should be cs226-assign-10-teamcode.tar.gz with teamcode replaced by the code assigned to your team for your repository and your mailing list. The tarball should contain no derived files whatsoever (i.e. no .class files, no .html files, etc.), but allow building all derived files. Your tarball should definitely not contain copies of the data files; instead, include instructions on where we have to put them for your code to find them. Include a README file that briefly explains what your programs do and contains any other notes you want us to check out before grading.

Grading

For reference, here is a short explanation of the grading criteria. Packaging refers to the proper organization of the stuff you hand in, following the guidelines for Deliverables above. Style refers to Java programming style, including things like consistent indentation, appropriate identifiers, useful comments, suitable javadoc documentation, etc. Simple, clean, readable code is what you should be aiming for. Performance refers to how fast your program can produce the required results compared to other submissions. Design refers to proper modularization and the proper choice of algorithms and data structures. Functionality refers to your programs being able to do what they should according to the specification given above; if the specification is ambiguous and you had to make a certain choice, defend that choice in your README file.

If your programs cannot be built you will get no points whatsoever. If your programs cannot be built without warnings using javac -Xlint we will take off 10% (except if you document a very good reason). If your programs fail miserably even once, i.e. terminate with an exception of any kind, we will take off 10%.

Kudos

This project is based on a similar assignment by Prof. Dr. Jason Eisner: Thanks a bunch for the inspiration! :-) You are free to read Jason's assignment which contains lots of advice and background information that I am not providing. However, remember that you are doing this version, not Jason's version. If you use any of Jason's ideas, please give proper credit in your README file.

Updated: $Id: assignment-10.html 445 2006-04-20 02:44:45Z phf $
Copyright © 2005-2006 Peter H. Fröhlich. All rights reserved.