= Accurate Word Extraction from Documents with Complex Layouts =
'''Type''': An interesting and practical bachelor's thesis. A basic understanding of machine learning '''is required'''; knowledge of deep learning is desirable. The preferred programming language is Python.
'''Background info''': TODO
'''Goal''': Merge the parts of hyphenated words using machine learning techniques. The approach must take into account that a word can be a compound word, in which case the hyphen between its two parts must be retained when the parts are merged.
''Challenge 1'': TODO
''Challenge 2'': TODO
'''Subgoal 1''': TODO
'''Subgoal 2''': TODO
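
To illustrate the decision described in the goal, the following is a minimal sketch of a simple dictionary-based baseline. It is not the machine learning approach the thesis is meant to develop. The function name `merge_hyphenated`, the tiny `vocabulary` set, and the fallback of keeping the hyphen for unknown words are all assumptions made for illustration only.

```python
def merge_hyphenated(prefix, suffix, vocabulary):
    """Merge two word parts that were separated by a hyphen.

    prefix:     the part before the hyphen (without the hyphen itself)
    suffix:     the part after the hyphen
    vocabulary: a set of known lowercase words (hypothetical; a realistic
                vocabulary would have to be far larger)
    """
    joined = prefix + suffix          # e.g. "extrac" + "tion" -> "extraction"
    compound = prefix + "-" + suffix  # e.g. "well" + "known" -> "well-known"

    # A known compound word keeps its hyphen.
    if compound.lower() in vocabulary:
        return compound
    # A known ordinary word is merged without the hyphen.
    if joined.lower() in vocabulary:
        return joined
    # Neither form is known: keep the hyphen (an arbitrary, conservative
    # fallback chosen for this sketch).
    return compound


# Toy vocabulary, made up for this example.
vocabulary = {"extraction", "well-known", "learning"}

print(merge_hyphenated("extrac", "tion", vocabulary))
print(merge_hyphenated("well", "known", vocabulary))
```

A lookup like this fails for every word that is missing from the vocabulary, which is one reason why a learned model may be a better fit for the task.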
<<BR>><<BR>>
Supervision by [[http://ad.informatik.uni-freiburg.de/staff/korzen|Claudius Korzen]].