5843
Comment:
|
8175
|
Deletions are marked like this. | Additions are marked like this. |
Line 1: | Line 1: |
Bachelor's and Master's Projects and Theses at the Chair for Algorithms and Data Structures. ''Note: This is a new page, created on 07-02-2017. The old page, which was outdated and in a terrible format, can still be found [[BachelorAndMasterProjectsAndThesesOld|here]].'' |
#acl Claudius Korzen:read,write Patrick Brosi:read,write Axel Lehmann:read,write Björn Buchhold:read,write All:read Bachelor's and Master's Projects and Theses at the Chair for Algorithms and Data Structures. ''Note: This is a new page, created on 07-02-2017. The old page, which was outdated and in a terrible format, can still be found [[BachelorAndMasterProjectsAndThesesOld|here]].'' |
Line 6: | Line 6: |
== Deliverables == | = Application for a project or thesis = If you are interested in doing a project or theses with us, please carefully read this page first and then send an e-mail to the prospective supervisor with the following information: |
Line 8: | Line 9: |
'''B.Sc. and M.Sc. thesis:''' (1) written thesis; (2) oral presentation (20 minutes + question); (3) well-documented version of your code and data, which allow reproducibility of your results, in our SVN. Here are some [[ThesesGuideline|guidelines on how to write a good thesis]]. See the list at the end of this page for many examples (of submitted theses and the accompanying presentation). | 1. Acknowledge that you have read this whole page (not just the first section) carefully. 1. Which courses have you already taken with us. 1. A very short description of your interests and your strengths (concerning work on a project/thesis). 1. A transcript of the grades of the courses you have take so far. 1. If you have an own project in mind (not necessary): a very short description of the goal and the scientific merit. |
Line 10: | Line 15: |
= Deliverables of a project or thesis = | |
Line 12: | Line 18: |
== Available projects and theses (click on the titles for more information) == | '''B.Sc. and M.Sc. thesis:''' (1) written thesis; (2) oral presentation (20 minutes + question); (3) well-documented version of your code and data, which allow reproducibility of your results, in our SVN. Here are some [[ThesesGuidelines|guidelines on how to write a good thesis]]. See the list at the end of this page for many examples (of submitted theses and the accompanying presentation). |
Line 14: | Line 20: |
[[BachelorAndMasterProjectsAndTheses/TokenizationRepair|Tokenization Repair (project and/or thesis)]]: Interesting and well-defined problem, the solution of which is relevant in a variety of information retrieval scenarios. Simple rule-based solutions come to mind easily, but machine learning is key to get very good results. A background in machine learning, or a strong willingness to aquire one as part of the project/thesis, is therefore mandatory for this project. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/bast|Hannah Bast]]. | = Available and ongoing projects and theses = {{attachment:available.png|Available|align="top"}} = Available; {{attachment:ongoing.png|Ongoing|align="top"}} = Ongoing |
Line 16: | Line 23: |
[[BachelorAndMasterProjectsAndTheses/SparqlTextUI|A User Interface for the QLever SPARQL+Text engine (project or thesis)]]: This is well-suited as a project (B.Sc. or M.Sc.) but also provides ample opportunity for continuation with a theses (B.Sc. or M.Sc). You should be fond of good user interfaces and have a good taste concerning layout and colors and such things. You should also like knowledge bases and big datasets and searching in them. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/bast|Hannah Bast]]. | Click on the titles for a more detailed description of the project, including background info, goals, and first steps. Note that for some topics, it can make sense if several people work on them concurrently. So ''ongoing'' does not necessarily mean, that the topic is no longer available. |
Line 18: | Line 25: |
[[BachelorAndMasterProjectsAndTheses/FrequencyMap|Generating Regular-Interval Maps from Public Transit Data (project or thesis, ongoing)]]: We are looking for well-structured input data for our [[http://panarea.informatik.uni-freiburg.de/gtfs-lines/|transit-map drawing algorithm]]. The goal of this topic is to provide the transit-mapper with a graph from which it can render regular-intervall timetable maps ("Taktfahrplankarten") like [[http://www.sma-partner.com/images/Downloads/NGCH-2015.pdf|this]]. You should be a fan of huge datasets, timetable data and of course graphs (but who isn't). As a bachelor or master project with the possibility of continuation as a thesis. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/brosi|Patrick Brosi]]. | {{attachment:available.png|Available|align="top"}} [[BachelorAndMasterProjectsAndTheses/TokenizationRepair|Tokenization Repair (project and/or thesis)]]: Interesting and well-defined problem, the solution of which is relevant in a variety of information retrieval scenarios. Simple rule-based solutions come to mind easily, but machine learning is key to get very good results. A background in machine learning, or a strong willingness to aquire one as part of the project/thesis, is therefore mandatory for this project. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/bast|Hannah Bast]]. |
Line 20: | Line 27: |
[[BachelorAndMasterProjectsAndTheses/GtfsMerger|Merging Overlapping GTFS Feeds (Bachelor project or thesis, ongoing)]]: Many transportation companies publish their timetable data either directly as GTFS feeds or in formats that can be converted to GTFS. As soon as you have two GTFS feeds (two sets of timetable data) that cover either the same or adjacent areas, the problem of duplicate trips arises. You should develop a tool that merges two or more GTFS feeds and solves duplication / fragmentation issues. As a bachelor project or thesis. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/brosi|Patrick Brosi]]. | {{attachment:available.png|Available|align="middle"}} [[BachelorAndMasterProjectsAndTheses/SparqlTextUI|A User Interface for the QLever SPARQL+Text engine (project or thesis)]]: This is well-suited as a project (B.Sc. or M.Sc.) but also provides ample opportunity for continuation with a theses (B.Sc. or M.Sc). You should be fond of good user interfaces and have a good taste concerning layout and colors and such things. You should also like knowledge bases and big datasets and searching in them. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/bast|Hannah Bast]]. |
Line 22: | Line 29: |
[[BachelorAndMasterProjectsAndTheses/GtfsBrowser|GTFS Browser Web App (Bachelor project or thesis)]]: Develop a web-application that can be used to analyze huge GTFS datasets. There are already some tools available (for example, [[https://github.com/google/transitfeed/wiki/ScheduleViewer|ScheduleViewer]]) but they all feel and look quite clumsy, are incredible slow and cannot handle large datasets. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/brosi|Patrick Brosi]]. | {{attachment:ongoing.png|Ongoing|align="bottom"}} [[BachelorAndMasterProjectsAndTheses/FrequencyMap|Generating Regular-Interval Maps from Public Transit Data (project or thesis, ongoing)]]: We are looking for well-structured input data for our [[http://panarea.informatik.uni-freiburg.de/gtfs-lines/|transit-map drawing algorithm]]. The goal of this topic is to provide the transit-mapper with a graph from which it can render regular-intervall timetable maps ("Taktfahrplankarten") like [[http://www.sma-partner.com/images/Downloads/NGCH-2015.pdf|this]]. You should be a fan of huge datasets, timetable data and of course graphs (but who isn't). As a bachelor or master project with the possibility of continuation as a thesis. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/brosi|Patrick Brosi]]. {{attachment:ongoing.png|Ongoing|align="top"}} [[BachelorAndMasterProjectsAndTheses/GtfsMerger|Merging Overlapping GTFS Feeds (Bachelor project or thesis, ongoing)]]: Many transportation companies publish their timetable data either directly as GTFS feeds or in formats that can be converted to GTFS. As soon as you have two GTFS feeds (two sets of timetable data) that cover either the same or adjacent areas, the problem of duplicate trips arises. You should develop a tool that merges two or more GTFS feeds and solves duplication / fragmentation issues. As a bachelor project or thesis. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/brosi|Patrick Brosi]]. {{attachment:available.png|Available|align="top"}} [[BachelorAndMasterProjectsAndTheses/GtfsBrowser|GTFS Browser Web App (Bachelor project or thesis)]]: Develop a web-application that can be used to analyze huge GTFS datasets. There are already some tools available (for example, [[https://github.com/google/transitfeed/wiki/ScheduleViewer|ScheduleViewer]]) but they all feel and look quite clumsy, are incredible slow and cannot handle large datasets. Supervised by [[https://ad.informatik.uni-freiburg.de/staff/brosi|Patrick Brosi]]. |
Line 25: | Line 36: |
In the first meeting with the supervisor, create a Google Doc where you copy and fill out the following template. The Google Doc must be named ''Firstname Lastname'' and it should be shared with at least the following people: Student, Supervisor(s), Hannah Bast (bast.hannah@gmail.com), Frank Dal-Ri (nirlad@gmail.com, our system adiminstrator), Heike Hägle (hhaegle@gmail.com, our secretary). |
== First meeting == In the first meeting with the supervisor, create a Google Doc where you copy and fill out the following template. The Google Doc must be named ''Firstname Lastname'' and it should be shared with at least the following people: Student, Supervisor(s), Hannah Bast ( bast.hannah@gmail.com ), Frank Dal-Ri ( nirlad@gmail.com , our system adiminstrator), Heike Hägle ( hhaegle@gmail.com , our secretary). |
Line 31: | Line 42: |
RZ Account: [initials + number, e.g. ''tf437''] Department Account: [usually the first seven letters of the given name + the initial of the first name, e.g. ''friedt''] |
RZ Account: [initials + number, e.g. ''es437''] Department Account: [usually the first seven letters of the given name + the initial of the first name, e.g. ''scissore''] |
Line 34: | Line 45: |
SVN: [subdirectory in student-projects or student-theses, named firstname-lastname, e.g. ''edward-scissorhands'']] | |
Line 38: | Line 50: |
Line 44: | Line 57: |
Line 47: | Line 59: |
== Follow-up meetings == It usually makes sense to have a few follow-up / progress meetings over the course of the projects. They usually take place, after certain milestones have been reached. Students should come well-prepared to these meetings. Here is a checklist: |
|
Line 48: | Line 62: |
1. Concise(!!) list of what has been done in he Google Doc (at the top) 1. Concise list of questions, if you have any 1. Code written so far should be in the SVN, not (only) on your local computer 1. All relevant files should be on our file system, not (only) on your local computer 1. A working demo of whatever you have done so far 1. The demo should run on our servers, not just locally on your computer 1. Bring your laptop (just in case there is uncommitted stuff) |
|
Line 49: | Line 70: |
== Completed projects and theses == |
= Completed projects and theses = |
Bachelor's and Master's Projects and Theses at the Chair for Algorithms and Data Structures. Note: This is a new page, created on 07-02-2017. The old page, which was outdated and in a terrible format, can still be found here.
Contents
Application for a project or thesis
If you are interested in doing a project or theses with us, please carefully read this page first and then send an e-mail to the prospective supervisor with the following information:
- Acknowledge that you have read this whole page (not just the first section) carefully.
- Which courses have you already taken with us.
- A very short description of your interests and your strengths (concerning work on a project/thesis).
- A transcript of the grades of the courses you have take so far.
- If you have an own project in mind (not necessary): a very short description of the goal and the scientific merit.
Deliverables of a project or thesis
B.Sc. and M.Sc. projects: (1) a project website; (2) well-documented version of your code and data, which allow reproducibility of your results, in our SVN. See the list at the end of this page for many examples (of project websites).
B.Sc. and M.Sc. thesis: (1) written thesis; (2) oral presentation (20 minutes + question); (3) well-documented version of your code and data, which allow reproducibility of your results, in our SVN. Here are some guidelines on how to write a good thesis. See the list at the end of this page for many examples (of submitted theses and the accompanying presentation).
Available and ongoing projects and theses
= Available; = Ongoing
Click on the titles for a more detailed description of the project, including background info, goals, and first steps. Note that for some topics, it can make sense if several people work on them concurrently. So ongoing does not necessarily mean, that the topic is no longer available.
Tokenization Repair (project and/or thesis): Interesting and well-defined problem, the solution of which is relevant in a variety of information retrieval scenarios. Simple rule-based solutions come to mind easily, but machine learning is key to get very good results. A background in machine learning, or a strong willingness to aquire one as part of the project/thesis, is therefore mandatory for this project. Supervised by Hannah Bast.
A User Interface for the QLever SPARQL+Text engine (project or thesis): This is well-suited as a project (B.Sc. or M.Sc.) but also provides ample opportunity for continuation with a theses (B.Sc. or M.Sc). You should be fond of good user interfaces and have a good taste concerning layout and colors and such things. You should also like knowledge bases and big datasets and searching in them. Supervised by Hannah Bast.
Generating Regular-Interval Maps from Public Transit Data (project or thesis, ongoing): We are looking for well-structured input data for our transit-map drawing algorithm. The goal of this topic is to provide the transit-mapper with a graph from which it can render regular-intervall timetable maps ("Taktfahrplankarten") like this. You should be a fan of huge datasets, timetable data and of course graphs (but who isn't). As a bachelor or master project with the possibility of continuation as a thesis. Supervised by Patrick Brosi.
Merging Overlapping GTFS Feeds (Bachelor project or thesis, ongoing): Many transportation companies publish their timetable data either directly as GTFS feeds or in formats that can be converted to GTFS. As soon as you have two GTFS feeds (two sets of timetable data) that cover either the same or adjacent areas, the problem of duplicate trips arises. You should develop a tool that merges two or more GTFS feeds and solves duplication / fragmentation issues. As a bachelor project or thesis. Supervised by Patrick Brosi.
GTFS Browser Web App (Bachelor project or thesis): Develop a web-application that can be used to analyze huge GTFS datasets. There are already some tools available (for example, ScheduleViewer) but they all feel and look quite clumsy, are incredible slow and cannot handle large datasets. Supervised by Patrick Brosi.
Instructions for students and supervisors
First meeting
In the first meeting with the supervisor, create a Google Doc where you copy and fill out the following template. The Google Doc must be named Firstname Lastname and it should be shared with at least the following people: Student, Supervisor(s), Hannah Bast ( bast.hannah@gmail.com ), Frank Dal-Ri ( nirlad@gmail.com , our system adiminstrator), Heike Hägle ( hhaegle@gmail.com , our secretary).
Short working title of project/thesis: [this may change later] RZ Account: [initials + number, e.g. ''es437''] Department Account: [usually the first seven letters of the given name + the initial of the first name, e.g. ''scissore''] Primary e-mail adress: SVN: [subdirectory in student-projects or student-theses, named firstname-lastname, e.g. ''edward-scissorhands'']] Special RAM requirements: Special Disk space requirements: Actual beginning of work: Planned end of work: Goal: Subgoals: First steps: Deadline for first steps:
The first steps should be something relative fail-safe which can just "be done". If the first steps are not even remotely finished by the agreed-upon deadline, we might not want to supervise you further. Supervision is a considerable effort for us and we care a lot about it. In particular, we spent a lot of thought and effort on formulating interesting topics (and goals, and subgoals, and first steps) which are interesting and relevant and related to our actual research. We care about the results of these projects/theses, so you should also care working on them.
Follow-up meetings
It usually makes sense to have a few follow-up / progress meetings over the course of the projects. They usually take place, after certain milestones have been reached. Students should come well-prepared to these meetings. Here is a checklist:
- Concise(!!) list of what has been done in he Google Doc (at the top)
- Concise list of questions, if you have any
- Code written so far should be in the SVN, not (only) on your local computer
- All relevant files should be on our file system, not (only) on your local computer
- A working demo of whatever you have done so far
- The demo should run on our servers, not just locally on your computer
- Bring your laptop (just in case there is uncommitted stuff)
Completed projects and theses
List of completed B.Sc. and M.Sc. theses (each with the written thesis and presentation)
List of completed B.Sc. and M.Sc. projects (each with a project website)