Internal note for our chair: each of us should provide 12 months of supervision every year. Each thesis or project counts for the supervisor just as it does for the student, where #months = #ECTS / 4. That is, B.Sc. project = 6 ECTS = 1.5 months, M.Sc. project = 16 ECTS = 4 months, B.Sc. thesis = 12 ECTS = 3 months, M.Sc. thesis = 25 ECTS = roughly 6 months.
Internal note to thesis supervisors: make sure that each thesis you supervise appears on our chair homepage. To do so: 1) put the final PDFs of the thesis and the presentation into /nfs/raid3/publications/theses (please follow the existing naming convention); the PDFs will then be available at http://ad-publications.informatik.uni-freiburg.de/theses/<PDF>; 2) send a mail to Sabine with the following information: title of thesis, name of student, thesis abstract, link to thesis PDF, link to presentation PDF.
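The supervision rule in the note for our chair can be checked with a few lines of Python. This is only a sketch of the arithmetic: the ECTS values are the ones stated in the note, while the function names and the example workload are made up for illustration. Note that the formula gives 6.25 months for an M.Sc. thesis, which the note rounds to 6.

```python
# ECTS values as stated in the note; all names here are illustrative.
ECTS = {
    "B.Sc. project": 6,
    "M.Sc. project": 16,
    "B.Sc. thesis": 12,
    "M.Sc. thesis": 25,
}

YEARLY_TARGET_MONTHS = 12  # supervision each of us should provide per year


def supervision_months(ects):
    """Apply the rule #months = #ECTS / 4."""
    return ects / 4


def yearly_load(items):
    """Sum the supervision months for a list of supervised items."""
    return sum(supervision_months(ECTS[item]) for item in items)


# Hypothetical workload of one supervisor in one year.
workload = ["M.Sc. thesis", "B.Sc. thesis", "B.Sc. project"]
load = yearly_load(workload)
print(f"{load} of {YEARLY_TARGET_MONTHS} months covered")
```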
List of available and ongoing topics for current B.Sc. and M.Sc. projects and theses
Note to interested students: if all projects in this list are ongoing, our current capacity for supervising projects and theses has been reached. New topics may become available later, so it is worth checking back. You can also propose a topic of your own.
Deep Learning for NLP (master project or thesis, available): We have several NLP problems (e.g., relation classification) to which it would be interesting to apply recent deep learning techniques. As a project, the task would mainly be to re-implement some existing work and compare it to (simpler) baselines. As a thesis, this would extend to improving performance using new ideas. Ideally, you already know about neural networks and have implemented and/or trained them. In any case, you should know the basics of machine learning and have practical experience with it. supervised by Elmar Haussmann
DeepDive (project, available): DeepDive is a framework from Stanford University that can construct a knowledge base (a database of facts like "Barack Obama" has-spouse "Michelle Obama") by analyzing large text corpora. It combines a large variety of state-of-the-art NLP techniques with distant supervision and probabilistic inference. The project's goals would be 1) to evaluate the framework by applying it to a large text corpus and 2) to incorporate advanced methods or features that improve extraction quality. You should be comfortable working in a Linux environment and need practical programming experience. supervised by Elmar Haussmann
Movie Search (thesis, ongoing): Our semantic search engine Broccoli answers semantic queries. Queries can, in theory, be very complex and powerful. However, it is almost impossible for inexperienced users to build appropriate structured queries. The goal of this thesis is to apply and adapt the search paradigm behind Broccoli to a narrow domain. Think of a search for movies where you can search by all the common criteria, i.e. actors, director, year, and genre, but also by keywords in the plot. For plot search in particular, Broccoli's features can come in handy. Looking for the movie where the character played by DiCaprio has drug problems? Linking the characters in the plot to knowledge-base entries can help a lot. This thesis involves many questions that make it far more research-oriented than technical. What is a good set of features to support? How can you achieve a good user experience? Where can you get good data about movies (structured data and plots)? Does it make sense to take general knowledge data (like that from Broccoli) and link it to the data about the movies? It may be necessary at some point to add new operations to Broccoli. It would be good if you were experienced in (or willing to pick up) multiple programming languages. Between data integration, user interfaces and Broccoli server code, it really makes sense to pick a suitable language for each job. supervised by Björn Buchhold
Efficient code for (de)compression (project, ongoing): Our semantic search engine Broccoli and its successor (under development) make heavy use of compressed lists. The current implementations are fast but far from optimal, and they have never been compared to other available compression schemes. The goal of the project is to tune the current (C++) implementation for speed, to survey the literature for promising alternatives, and to compare them. supervised by Björn Buchhold
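To give an idea of what compressing lists can look like, here is a minimal Python sketch of a textbook baseline: gap encoding of a sorted ID list followed by variable-byte coding. This is an assumption chosen for illustration only. The note does not say which schemes Broccoli actually uses, and all function names below are hypothetical.

```python
def to_gaps(sorted_ids):
    """Replace each ID by its difference to the previous ID."""
    gaps, previous = [], 0
    for doc_id in sorted_ids:
        gaps.append(doc_id - previous)
        previous = doc_id
    return gaps


def from_gaps(gaps):
    """Invert to_gaps by computing running sums."""
    ids, total = [], 0
    for gap in gaps:
        total += gap
        ids.append(total)
    return ids


def vbyte_encode(numbers):
    """Encode non-negative integers with 7 payload bits per byte.

    The high bit is set on the last byte of each number.
    """
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)  # low 7 bits, more bytes follow
            n >>= 7
        out.append(n | 0x80)  # final byte of this number
    return bytes(out)


def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:  # last byte of the current number
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


ids = [3, 7, 300, 100000]
encoded = vbyte_encode(to_gaps(ids))
decoded = from_gaps(vbyte_decode(encoded))
print(len(encoded), decoded == ids)
```

Small gaps fit into a single byte, which is why encoding the gaps rather than the raw IDs usually pays off for dense lists. A comparison like the one the project asks for would measure both the compressed size and the (de)compression speed of such schemes.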
Structured extraction of text from scientific PDF documents (master thesis, ongoing): Many documents are only available in PDF, a format originally intended for platform-independent, uniform display and printing. To process the information they contain, one has to extract the plain text. To do this correctly, one has to take the formatting structure of the documents into account in order to identify the parts of the text that belong together. The goal is to build such a non-trivial structured text extraction on top of an available PDF library such as PDFBox, so that the extracted text can be further processed by our search engine ... supervised by Claudius Korzen
A mobile app for kitchen management (project, ongoing): Our comfortable kitchen provides a lot of drinks and food. Whenever employees consume something, they are asked to mark it on our tally sheet and to pay the accumulated amount whenever they want. In practice, the amount noted is considerably smaller than the amount actually due, because of uncertainty about the prices and the forgetfulness of our staff. Moreover, payment behavior could be better in general. The goal of this project is to implement a mobile app that can be used to manage the kitchen accounts of our staff. At a minimum, it should be possible to scan a product before consuming it and to debit the employee's account with the corresponding price. At regular intervals, a payment reminder should be sent to each employee. The app can be extended with any number of additional features ... supervised by Claudius Korzen
Learning from OSM data (theses or larger projects, available): OpenStreetMap (OSM) contains a lot of data that geo-search and rendering engines cannot yet make sufficient use of. A variety of topics is available in this area, e.g. landscape classification based on street network data, learning classifiers for regions of interest (e.g. industrial areas), developing data structures to answer queries with specific geo-relations ('next to', 'south of', 'along'), automated tagging and classification of points of interest (e.g. lake -> place to swim), etc. ... supervised by Sabine Storandt
Question Answering using Semantic Full-Text Search (thesis, available): Use our semantic full-text search engine Broccoli for the FACTOID and LIST questions from one of the TREC Question Answering benchmarks, e.g. http://trec.nist.gov/data/qa/2006_qadata/QA2006_testset.xml . What is the performance of manually constructed queries, and what can be done to improve it? It helps if you have attended the information retrieval lecture, but this is not strictly necessary ... supervised by Hannah Bast and [Elmar or Björn]
An easy-to-use web app for the CompleteSearch engine (project, available): The CompleteSearch engine provides complex search capabilities (including prefix search and faceted search) on semi-structured data (text and databases) with very fast query times. Our current software is powerful and flexible, but it has quite a learning curve before it can be used. The goal of this project would be to set up a web app where one can upload any CSV dataset and then get a convenient search (with meaningful default settings) without having to set up anything oneself. This would be an extremely useful web application ... supervised by Hannah Bast
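As a rough illustration of what "upload a CSV and search it by prefix" means, the following Python sketch indexes the words of a small CSV and answers prefix queries with a binary search over the sorted vocabulary. It is a toy under stated assumptions: it does not use or reflect the CompleteSearch code or API, and the sample data and all names are invented.

```python
import bisect
import csv
import io


def build_index(csv_text):
    """Index every lower-cased word of every cell by row number."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    postings = {}
    for row_id, row in enumerate(rows):
        for value in row.values():
            for word in (value or "").lower().split():
                postings.setdefault(word, set()).add(row_id)
    vocabulary = sorted(postings)
    return rows, vocabulary, postings


def prefix_search(prefix, vocabulary, postings):
    """Return the sorted row numbers containing a word with this prefix."""
    prefix = prefix.lower()
    start = bisect.bisect_left(vocabulary, prefix)
    hits = set()
    for word in vocabulary[start:]:
        if not word.startswith(prefix):
            break  # sorted vocabulary: no later word can match
        hits |= postings[word]
    return sorted(hits)


# Invented sample data standing in for an uploaded CSV file.
SAMPLE = (
    "title,year\n"
    "Broccoli search,2012\n"
    "Complete Search engine,2007\n"
    "Browser tools,2012\n"
)

rows, vocabulary, postings = build_index(SAMPLE)
for row_id in prefix_search("bro", vocabulary, postings):
    print(rows[row_id]["title"])
```

A real web app would additionally need sensible defaults for things like column types and facets, which is exactly the part the project is meant to work out.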
Completed B.Sc. or M.Sc. projects and theses
[The list is still incomplete; in particular, all of the projects are currently missing. Each entry should also give the title of the work, a link to the respective web page or to the thesis and presentation, and the start and end dates.]
M.Sc. thesis Max Lotstein (Elmar)
B.Sc. thesis Manuel Ruder (Elmar)
M.Sc. thesis Niklas Meinzer (Sabine)
B.Sc. thesis Christian Ehrenfeld (Hannah)
B.Sc. thesis Philipp Bausch (Elmar)
M.Sc. thesis Eugen Sawin (Hannah)
M.Sc. thesis Patrick Brosi (Hannah + Sabine)
B.Sc. thesis Marius Bethge (Björn)
M.Sc. thesis Cynthia Jimenez (Sabine)
M.Sc. thesis Jonas Sternisko (Hannah)
M.Sc. thesis Ragavan Natarajan (Florian)
B.Sc. thesis Benjamin Meier (Claudius)
M.Sc. thesis Susanne Eichel (Hannah)
B.Sc. thesis Axel Lehmann (Hannah)
B.Sc. thesis Adrian Batzill (Hannah)
B.Sc. thesis Anton Stepan (Björn)
M.Sc. thesis Mirko Brodesser (Hannah + Sabine)
M.Sc. thesis Manuel Braun (Hannah + Sabine)
B.Sc. thesis Robin Schirrmeister (Hannah)
B.Sc. thesis Simon Skilevic (Hannah)
M.Sc. thesis Ilinca Tudose (Hannah + Elmar)
B.Sc. thesis Christiane Schaffer (Florian)
M.Sc. thesis Dirk Kienle (Hannah)
B.Sc. thesis Ina Baumgarten (Hannah + Björn)
B.Sc. thesis Niklas Meinzer (Hannah + Björn)
M.Sc. thesis Claudius Korzen (Hannah)
M.Sc. thesis Florian Bäurle (Hannah)
M.Sc. thesis Elmar Haußmann (Hannah)
M.Sc. thesis Oliver Mitevski (Hannah + Marjan)
M.Sc. thesis Björn Buchhold (Hannah)
Diploma thesis Johannes Schwenk (Hannah)
B.Sc. thesis Mirko Brodesser (Hannah + Marjan)