AD Teaching Wiki:

Accurate Word Extraction from Documents with Complex Layouts

Type: An interesting and practical bachelor thesis. A basic understanding of Machine Learning is required; knowledge of Deep Learning is desirable. The preferred programming language is Python.

Background info: TODO

Goal: Merge hyphenated words using machine learning techniques. The approach must take into account that a word can be a compound word (e.g. "well-known"), in which case the hyphen between the two parts of the word needs to be retained when the parts are merged.
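
To make the two cases in the goal concrete, here is a minimal sketch of a dictionary-based baseline. It is not the machine learning approach the thesis asks for; it only illustrates the decision that such a model would have to make. The function name `merge_hyphenated`, the toy `VOCABULARY`, and the fallback rule are all assumptions made for illustration.

```python
def merge_hyphenated(first_part, second_part, vocabulary):
    """Merge a word that was split into `first_part` (ending in "-") and `second_part`.

    Hypothetical baseline: decides by vocabulary lookup whether the hyphen
    is dropped or retained (compound word).
    """
    stem = first_part.rstrip("-")
    joined = stem + second_part            # e.g. "extrac" + "tion"
    hyphenated = stem + "-" + second_part  # e.g. "well" + "-" + "known"

    # Case 1: the joined form is a known word, so the hyphen is dropped.
    if joined.lower() in vocabulary:
        return joined

    # Case 2: the hyphenated form is known, or both parts are words on
    # their own, so we treat it as a compound and retain the hyphen.
    both_parts_known = (
        stem.lower() in vocabulary and second_part.lower() in vocabulary
    )
    if hyphenated.lower() in vocabulary or both_parts_known:
        return hyphenated

    # Fallback (an assumption): drop the hyphen for unknown words.
    return joined


# Toy vocabulary, for illustration only.
VOCABULARY = {"extraction", "document", "well", "known", "well-known"}

print(merge_hyphenated("extrac-", "tion", VOCABULARY))  # extraction
print(merge_hyphenated("well-", "known", VOCABULARY))   # well-known
```

A lookup like this fails for words missing from the vocabulary and for ambiguous cases, which is where a learned model could help.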

Challenge 1: TODO

Challenge 2: TODO

Subgoal 1: TODO

Subgoal 2: TODO

Supervision by Claudius Korzen.

AD Teaching Wiki: BachelorAndMasterProjectsAndTheses/MergingHyphenatedWords (last edited 2019-05-05 21:17:18 by adpult)