## page was renamed from BachelorAndMasterProjectsAndTheses
## page was renamed from BachelorAndMasterProjects
#acl All:read

''Internal note for our chair: each of us should offer / provide 12 months of supervision every year. Each thesis or project "counts" just like it counts for the students, where #months = #ECTS / 4. That is, B.Sc. project = 6 ECTS = 1.5 months, M.Sc. project = 16 ECTS = 4 months, B.Sc. thesis = 12 ECTS = 3 months, M.Sc. thesis = 25 ECTS = roughly 6 months.''

''Internal note to thesis supervisors: you need to make sure that each supervised thesis appears on our chair homepage. For this you need to: 1) put the final PDFs of the thesis and the presentation into /nfs/raid3/publications/theses (please follow the existing naming convention!); the PDFs will then be available at [[http://ad-publications.informatik.uni-freiburg.de/theses/|http://ad-publications.informatik.uni-freiburg.de/theses/]]; 2) send a mail to Sabine with the following information: title of thesis, name of student, thesis abstract, link to thesis PDF, link to presentation PDF.''

== List of available and ongoing topics for current B.Sc. and M.Sc. projects and theses ==

''Note to interested students: if all projects in this list are ongoing, our current capacity for supervising projects and theses has been reached. If you come back later, there may be new offers. You can also propose a topic of your own.''

'''Movie Search (thesis, ongoing)''': Our semantic search engine [[http://broccoli.cs.uni-freiburg.de|Broccoli]] answers semantic queries. Queries can, in theory, be very complex and powerful. However, it is almost impossible for inexperienced users to build appropriate structured queries. The goal of this thesis is to apply and adapt the search paradigm behind Broccoli to a narrow domain. Think of a search for movies where you can search by all the common things, such as actors, director, year, and genre, but also by keyword search in the plot.
However, for plot search, Broccoli features can come in handy. Looking for the movie where the character played by DiCaprio has drug problems? Linking the characters in the plot to knowledge-base entries can help a lot. In this thesis you will deal with many aspects that make it far more research-oriented than technical. What is a good set of features to support? How can you achieve a good user experience? Where can you get good data about movies (structured data and plots)? Does it make sense to take general knowledge data (like that from Broccoli) and link it to the data about the movies? It may be necessary at some point to add new operations to Broccoli. It would be good if you were experienced in (or willing to pick up) multiple programming languages. Between data integration, user interfaces, and Broccoli server code, it really makes sense to pick a suitable language for each job. ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/buchhold|Björn Buchhold]]''

'''Efficient code for (de)compression (project, ongoing)''': Our semantic search engine [[http://broccoli.cs.uni-freiburg.de|Broccoli]] and its successor (under development) make heavy use of compressed lists. The current implementations are fast but far from optimal and have never been compared to other available compression schemes. The goal of the project is to tune the current (C++) implementation for speed, to survey the literature for promising alternatives, and to compare them. ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/buchhold|Björn Buchhold]]''

## '''Value Recognition in Full-Text (thesis, available)''': Text documents contain various values of different types. Dates, weights, lengths, heights and many more can be expressed plainly, e.g. ''13kg'', less obviously, e.g. ''ten tons'', or in a highly obfuscated way, e.g. ''for the first half of the year''. Our semantic search engine [[http://broccoli.cs.uni-freiburg.de|Broccoli]] does not make use of this information yet.
The task of the thesis is to produce a ([[http://uima.apache.org/|UIMA]]) component that extracts such values from text documents; these serve as input for further steps that enable semantic search matching such values. Additionally, the findings should be evaluated. Knowing which kinds of values are truly hard to extract and which are relatively easy is very valuable as well. ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/buchhold|Björn Buchhold]]''

## '''Setting up a browser for DBPedia with Broccoli (project, available)''' We derived a simplified version of the [[http://www.freebase.com/|Freebase]] ontology and set up a web application, called Freebase Easy, to browse the data based on our Broccoli search engine: http://freebase-easy.cs.uni-freiburg.de . The goal of this project would be to do something similar with the [[http://dbpedia.org/|DBPedia]] ontology. This means modifying the freely downloadable dataset so that it 1) can be processed by our pipeline and 2) has a form that allows for comfortable browsing with our user interface (e.g. by providing unique and proper human-readable names for instances and types) ... ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/baeurlef|Florian Bäurle]]''

## '''A keyword query translator for Broccoli (bachelor thesis, ONGOING)''' Our Broccoli search engine has its own special query language (SPARQL-like trees). Although the user interface guides users through the incremental query creation process, many users are accustomed to simple keyword queries as used, e.g., when searching with Google. To make the interface more attractive for these users, and for the general convenience of quick query creation, it would be nice to have a mechanism that translates normal keyword queries into equivalent structured Broccoli queries.
We already implemented an experimental mechanism that can make such translations as a proof of concept (e.g., just try entering "mafia films by Francis Coppola" into the input field of the user interface). The work of this thesis would be to implement a better, more powerful query translator ... ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/baeurlef|Florian Bäurle]]''

## '''Usability study for the Broccoli user interface (bachelor thesis, available)''' We developed a special user interface for the prototype of our semantic full-text search engine Broccoli: http://broccoli.cs.uni-freiburg.de . An open task is to conduct a thorough user study to evaluate the usability of the interface, compare it to other search interfaces, and identify weak points that could be improved ... ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/baeurlef|Florian Bäurle]]''

'''Question Answering using Semantic Full-Text Search (thesis, available)''': Use our semantic full-text search engine Broccoli for the FACTOID and LIST questions from one of the TREC Question Answering benchmarks, e.g. http://trec.nist.gov/data/qa/2006_qadata/QA2006_testset.xml . What is the performance of manually constructed queries, and what can be done to improve it? It helps if you have taken the information retrieval lecture, but this is not strictly necessary ... ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/bast|Hannah Bast]] and [Elmar or Björn]''

'''An easy-to-use web app for the !CompleteSearch engine (project, available)''': The !CompleteSearch engine offers complex search capabilities (including prefix search and faceted search) on semi-structured data (text and databases) with very fast query times. Our current software is powerful and flexible, but has quite a learning curve before it can be used.
The goal of this project would be to set up a web app where one can upload any CSV dataset and then have a convenient search (with meaningful default settings) without having to set up anything oneself. This would be an extremely useful web application ... ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/bast|Hannah Bast]]''

'''PDF to TeX Parser (project or thesis, ongoing)''': By default, you can't rearrange or edit text in PDF files. Unless you have the TeX sources of a PDF, there is no way to edit it: while creating a PDF from TeX file(s) is very common, the other way around, recovering the TeX source(s) from given PDF(s), isn't. The goal of this project is to implement a parser that produces TeX sources from given PDF files, so that the sources can be edited and recompiled into a new PDF ... ''also available as master thesis if combined with the topic "Formula recognition in PDF files" (see below)'' ... ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/korzen|Claudius Korzen]]''

## '''Formula recognition in PDF files (project or thesis, available)''': By default, text and formulas in a PDF file are "drawn" into the PDF without any semantic markup. This makes it difficult to understand the meaning of a formula when extracting text / formulas from PDF files. The goal of this project is to implement a parser that identifies and "understands" the formulas in a given PDF, e.g. by translating the formulas back to their corresponding TeX commands. ... ''also available as master thesis if combined with the topic "PDF to TeX Parser" (see above)'' ... ''supervised by [[http://ad.informatik.uni-freiburg.de/staff/korzen|Claudius Korzen]]''

== Completed B.Sc. or M.Sc. projects and theses ==

''[The list is still incomplete; in particular, all the projects are currently missing. The title of each thesis should also be listed, along with a link to the respective web page or to the thesis and presentation, and the start and end dates.]''

M.Sc. thesis Max Lotstein (Elmar)
B.Sc. thesis Manuel Ruder (Elmar)
M.Sc. thesis Niklas Meinzer (Sabine)
B.Sc. thesis Christian Ehrenfeld (Hannah)
B.Sc. thesis Philipp Bausch (Elmar)
M.Sc. thesis Eugen Sawin (Hannah)
M.Sc. thesis Patrick Brosi (Hannah + Sabine)
B.Sc. thesis Marius Bethge (Björn)
M.Sc. thesis Cynthia Jimenez (Sabine)
M.Sc. thesis Jonas Sternisko (Hannah)
M.Sc. thesis Ragavan Natarajan (Florian)
B.Sc. thesis Benjamin Meier (Claudius)
M.Sc. thesis Susanne Eichel (Hannah)
B.Sc. thesis Axel Lehmann (Hannah)
B.Sc. thesis Adrian Batzill (Hannah)
B.Sc. thesis Anton Stepan (Björn)
M.Sc. thesis Mirko Brodesser (Hannah + Sabine)
M.Sc. thesis Manuel Braun (Hannah + Sabine)
B.Sc. thesis Robin Schirrmeister (Hannah)
B.Sc. thesis Simon Skilevic (Hannah)
M.Sc. thesis Ilinca Tudose (Hannah + Elmar)
B.Sc. thesis Christiane Schaffer (Florian)
M.Sc. thesis Dirk Kienle (Hannah)
B.Sc. thesis Ina Baumgarten (Hannah + Björn)
B.Sc. thesis Niklas Meinzer (Hannah + Björn)
M.Sc. thesis Claudius Korzen (Hannah)
M.Sc. thesis Florian Bäurle (Hannah)
M.Sc. thesis Elmar Haußmann (Hannah)
M.Sc. thesis Oliver Mitevski (Hannah + Marjan)
M.Sc. thesis Björn Buchhold (Hannah)
Diploma thesis Johannes Schwenk (Hannah)
B.Sc. thesis Mirko Brodesser (Hannah + Marjan)