|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||
java.lang.Object | +--geography.AddressFinder
Maps from a string like "3400 charles street" to a address nearby whose coordinates we actually know, such as 3398 N Charles St. (The known addresses come from StreetSegments added to this AddressFinder.) We try to make a decent guess even if the user types in "3403 charle road." or something like that.
This process of locating addresses is generally known as geocoding. We are not solving the whole geocoding problem here. An address finder for the whole country (or world) would have to locate the city first. We're just building an address finder for a single city.
A really good solution to this might use some kind of approximate string matching null; //
What we do currently is quite a bit simpler than that, but still pretty reasonable. First we look up the street in an alphabetically ordered dictionary of streets we know about. If we can't find it exactly, then we pick the closest one. Then we look up the street number in a numerically ordered dictionary for that street. Again, if we can't find it exactly, we pick the closest one.
We will use red-black trees for both dictionaries, courtesy of the java.util.TreeMap class. These implement the java.util.SortedMap interface (Java's name for an ordered dictionary).
Note that we have a nested data structure -- a big dictionary whose elements are little dictionaries. We use the big dictionary to turn an approximate street name into a Street object, which contains a little dictionary that we use to turn an approximate house number on that street into an address whose location we know.
There are a few tricks we use to make the approximate matching work better:
Similarly, if the user just types "Charlas St", it falls between Charing Cross Rd and Charlcote Pl. But we would be better off picking Charles St, because it is a major road, and it has just as long a prefix in common with Charles St as it does with Charlcote Pl.
So in fact, we do not limit our choice to just the two keys that the search key falls between alphabetically. We consider ALL keys that share the longest possible prefix in common with the search key (e.g., "Falls" or "Charl"). Those keys form a (usually short) contiguous subsequence of the sorted dictionary. We choose the biggest street in that subsequence.
How do we know which is the biggest street? The one that has the most known addresses is the one that the user is most likely to be asking about.
| Nested Class Summary | |
protected class |
AddressFinder.Street
A class that represents information about a particular street. |
| Field Summary | |
protected java.util.SortedMap |
smStreets
The "big map" that stores Streets, keyed by simplified versions of their names (see simplify()). |
| Constructor Summary | |
AddressFinder()
Constructor: New address finder, contains no information yet. |
|
| Method Summary | |
void |
addStreetSegment(geography.StreetSegment ss)
In general, a StreetSegment specifies 2 intersections and up to 4 addresses (up to two per intersection). |
protected static java.lang.Object[] |
closestKeys(java.util.SortedMap sm,
java.lang.Object key)
Look up key in the SortedMap sm. |
geography.Intersection |
findIntersectionNear(java.lang.String address)
Finds an intersection near this address. |
protected AddressFinder.Street |
findOrInsertStreet(geography.StreetSegment ss)
Given a StreetSegment, look up its Street by name in the database. |
protected AddressFinder.Street |
findStreetLike(java.lang.String name)
Find the street whose name is most similar to the given name, breaking ties in favor of streets with more addresses. |
protected static java.lang.String |
keyFromStreetName(java.lang.String streetName)
|
private static int |
lcpLength(java.lang.String s1,
java.lang.String s2)
|
int |
numKnownStreets()
Number of street names that this AddressFinder knows about. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
protected java.util.SortedMap smStreets
| Constructor Detail |
public AddressFinder()
| Method Detail |
public void addStreetSegment(geography.StreetSegment ss)
public int numKnownStreets()
public geography.Intersection findIntersectionNear(java.lang.String address)
throws DataFormatException
DataFormatException - if the address is too grungy
for us to deal with.protected AddressFinder.Street findOrInsertStreet(geography.StreetSegment ss)
protected AddressFinder.Street findStreetLike(java.lang.String name)
protected static java.lang.String keyFromStreetName(java.lang.String streetName)
protected static java.lang.Object[] closestKeys(java.util.SortedMap sm,
java.lang.Object key)
private static int lcpLength(java.lang.String s1,
java.lang.String s2)
|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||