Revision 4 as of 2020-08-24 09:43:18
= A Search Engine for OpenStreetMap Data (project and/or thesis) =
Background info: OpenStreetMap is a collaborative mapping project and basically consists of geographically located points ("nodes"), lines ("ways") and groups of them ("relations"). All objects can be further described by means of key=value pairs, which are mainly used to give them certain attributes (for example "name"="Freiburg im Breisgau"). There are a number of search engines available for this data (see http://wiki.openstreetmap.org/wiki/Search_engines), and many of them are open source. The "best" engine at the moment is Nominatim, which is also used by the official OSM map page. It has a debug interface for testing (http://nominatim.openstreetmap.org/). Nominatim builds on a PostgreSQL database.
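To make the data model concrete, here is a minimal sketch in Python. The XML fragment is hand-written in the OSM XML layout (the IDs and coordinates are made up for illustration), and the helper name `tags_of` is an assumption of this example, not part of any OSM tool. It only shows how nodes, ways and relations carry their key=value pairs.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the OSM XML layout; IDs and coordinates are made up.
SAMPLE = """<osm version="0.6">
  <node id="1" lat="47.9959" lon="7.8522">
    <tag k="name" v="Freiburg im Breisgau"/>
    <tag k="place" v="city"/>
  </node>
  <way id="2">
    <nd ref="1"/>
    <tag k="highway" v="residential"/>
  </way>
  <relation id="3">
    <member type="way" ref="2" role="outer"/>
    <tag k="type" v="multipolygon"/>
  </relation>
</osm>"""


def tags_of(element):
    """Collect the key=value pairs of one node, way or relation (hypothetical helper)."""
    return {tag.get("k"): tag.get("v") for tag in element.findall("tag")}


root = ET.fromstring(SAMPLE)
# One (type, id, tags) triple per top-level object.
summary = [(el.tag, el.get("id"), tags_of(el)) for el in root]
for kind, osm_id, tags in summary:
    print(kind, osm_id, tags)
```

Loading the whole document with `ET.fromstring` is fine for a fragment like this, but it would not scale to a real extract of several GB.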
Goal: You should develop a search engine for OSM data that aims to be comparable in speed and quality to Nominatim. It does not matter if you don't reach this goal, but you should strive for it nevertheless.
Optional goal: Build a small example web application that uses your backend to provide a searchable map.
Step 1: Familiarize yourself with raw OSM data. Download a small .osm file and open it in a text editor (preferably vim; other editors tend to dislike text files that are multiple GB in size). Search around in the file. Try to find a city. Try to find a popular tourist attraction. Try to find a place that sells Mexican food. Play around with the OSM toolkit (osmfilter, osmconvert, ...). This should not take more than 1-2 days.
Step 2: Start coding. Build a parser for OSM data (keep memory efficiency in mind).
Step 3: Select one or multiple attributes (for example, "name") and build an inverted index with it. Write tests for it. Think of a useful interface for your index class and put it into an abstract class.
Step 4: Build an HTTP server (for example, starting from your code from the Information Retrieval lecture) that takes an instance of the abstract index class and answers queries.
Step 5: Realize that the results from Nominatim are still infinitely superior to yours.
Step 6: Think of ways to change this.
Step 7: Goto 5.
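As a starting point for Steps 2 to 4, the following Python sketch wires a streaming parser, an abstract index interface with a simple inverted index, and a small HTTP handler together. It is one possible layout under stated assumptions, not a reference solution: all names (`parse_osm`, `Index`, `InvertedIndex`, `build_index`, `make_handler`, `serve`), the query parameter `q` and port 8080 are made up for this example, and tokenization is plain lowercase whitespace splitting with AND semantics.

```python
import io
import json
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from collections import defaultdict
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def parse_osm(source, key="name"):
    """Step 2: stream (type, id, value) for every object that carries `key`."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag in ("node", "way", "relation"):
            for tag in elem.findall("tag"):
                if tag.get("k") == key:
                    yield elem.tag, int(elem.get("id")), tag.get("v")
            root.clear()  # discard finished objects so memory stays bounded


class Index(ABC):
    """Step 3: the interface the server depends on (hypothetical)."""

    @abstractmethod
    def add(self, doc_id, text):
        """Register `text` under `doc_id`."""

    @abstractmethod
    def search(self, query):
        """Return the sorted ids of all documents matching `query`."""


class InvertedIndex(Index):
    def __init__(self):
        self.postings = defaultdict(set)  # token -> set of document ids

    def add(self, doc_id, text):
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def search(self, query):
        tokens = query.lower().split()
        if not tokens:
            return []
        # AND semantics: a document must contain every query token.
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(hits)


def build_index(source, key="name"):
    index = InvertedIndex()
    for kind, osm_id, value in parse_osm(source, key):
        index.add(f"{kind}/{osm_id}", value)
    return index


def make_handler(index):
    """Step 4: build a request handler bound to any Index implementation."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Assumed URL shape: /?q=some+words
            query = parse_qs(urlparse(self.path).query).get("q", [""])[0]
            body = json.dumps(index.search(query)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return Handler


def serve(path, port=8080):
    """Index the file at `path` and answer queries until interrupted."""
    HTTPServer(("localhost", port), make_handler(build_index(path))).serve_forever()


# Tiny made-up sample so the sketch can be tried without downloading data.
SAMPLE = b"""<osm version="0.6">
  <node id="1" lat="47.9959" lon="7.8522"><tag k="name" v="Freiburg im Breisgau"/></node>
  <node id="2" lat="47.9960" lon="7.8500"><tag k="cuisine" v="mexican"/></node>
  <way id="3"><nd ref="1"/><tag k="name" v="Freiburg Hauptbahnhof"/></way>
</osm>"""

index = build_index(io.BytesIO(SAMPLE))
print(index.search("freiburg"))
print(index.search("freiburg im"))
```

Calling `serve("some-extract.osm")` (the file name is a placeholder) would start the server, after which a request such as `http://localhost:8080/?q=freiburg` returns a JSON list of matching object ids. Because the handler only sees the abstract `Index` interface, the inverted index can later be replaced by something with ranking or fuzzy matching, which is where Steps 5 to 7 come in.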