'''Your interests/skills:''' Deep Learning, Natural Language Processing, Neural Machine Translation

Recently, neural machine translation has improved substantially through encoder-decoder networks with attention and Transformer networks. Like machine translation, spelling correction is a sequence-to-sequence transformation task, where the input language is misspelled English and the target language is correctly spelled English. In this project or thesis, you will solve this task with deep learning. While the idea looks very promising, there are many details one has to get right to achieve good performance: finding (or generating) good training data, choosing the right representations for the input and target language, and tuning the many hyperparameters of model design and training. You should be familiar with at least one deep learning framework for Python (!TensorFlow or !PyTorch).

'''Step 1:''' Start simple: take a text corpus of your choice and inject a few random characters into each sentence. Build and train a small encoder-decoder network that learns to remove the injected characters from the sentences.

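The data generation part of Step 1 could look roughly like the following sketch. It only covers the noise injection, not the network. The function names `inject_random_chars` and `make_pairs`, the lowercase ASCII alphabet, and the default of two errors per sentence are illustrative assumptions, not part of the project description.

```python
import random
import string


def inject_random_chars(sentence, n_errors=2, alphabet=string.ascii_lowercase, rng=None):
    """Return a copy of `sentence` with `n_errors` random characters inserted.

    The alphabet and the number of errors are assumptions for illustration.
    """
    rng = rng or random.Random()
    chars = list(sentence)
    for _ in range(n_errors):
        # randint is inclusive on both ends, so a character may also be
        # appended after the last position.
        pos = rng.randint(0, len(chars))
        chars.insert(pos, rng.choice(alphabet))
    return "".join(chars)


def make_pairs(sentences, n_errors=2, seed=0):
    """Build (noisy input, clean target) pairs for sequence-to-sequence training."""
    rng = random.Random(seed)
    return [(inject_random_chars(s, n_errors, rng=rng), s) for s in sentences]


if __name__ == "__main__":
    for noisy, clean in make_pairs(["the quick brown fox", "spelling correction"]):
        print(repr(noisy), "->", repr(clean))
```

Each pair uses the corrupted sentence as the encoder input and the original sentence as the decoder target. Fixing the seed keeps the generated training set reproducible between runs.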
'''Steps 2 to N:''' From here you can proceed in many directions. Extend the types of randomized errors (e.g. character deletions, replacements and swaps, as well as merges and splits of words) and find collections of real misspellings to train and evaluate on; try different models (recurrent models without and with attention, Transformer) and optimize their hyperparameters; test different input and output representations (characters, subwords, words or embeddings); and many more.

'''Goal:''' In the end, you will need to compare the performance of your approach to at least one baseline method and one existing spell checker, and provide insight into what works well (or not) and ideally why it works (or why it does not).
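The additional randomized error types mentioned under Steps 2 to N (deletions, replacements, swaps, merges and splits) might be sketched as below. All function names, the lowercase ASCII alphabet, and the uniform choice among error types are assumptions made for illustration; real misspellings follow different distributions.

```python
import random
import string

ALPHABET = string.ascii_lowercase  # assumed replacement alphabet


def delete_char(s, rng):
    """Remove one randomly chosen character."""
    if not s:
        return s
    i = rng.randrange(len(s))
    return s[:i] + s[i + 1:]


def replace_char(s, rng):
    """Replace one character with a different letter from ALPHABET."""
    if not s:
        return s
    i = rng.randrange(len(s))
    choices = [c for c in ALPHABET if c != s[i]]
    return s[:i] + rng.choice(choices) + s[i + 1:]


def swap_chars(s, rng):
    """Swap two adjacent characters (no visible change if they are equal)."""
    if len(s) < 2:
        return s
    i = rng.randrange(len(s) - 1)
    return s[:i] + s[i + 1] + s[i] + s[i + 2:]


def merge_words(s, rng):
    """Remove one space, merging two neighbouring words."""
    spaces = [i for i, c in enumerate(s) if c == " "]
    if not spaces:
        return s
    i = rng.choice(spaces)
    return s[:i] + s[i + 1:]


def split_word(s, rng):
    """Insert a space strictly inside a word."""
    positions = [i for i in range(1, len(s)) if s[i - 1] != " " and s[i] != " "]
    if not positions:
        return s
    i = rng.choice(positions)
    return s[:i] + " " + s[i:]


ERROR_FUNCTIONS = [delete_char, replace_char, swap_chars, merge_words, split_word]


def corrupt(s, n_errors=1, rng=None):
    """Apply `n_errors` randomly chosen error functions in sequence."""
    rng = rng or random.Random()
    for _ in range(n_errors):
        s = rng.choice(ERROR_FUNCTIONS)(s, rng)
    return s
```

Keeping each error type in its own function makes it easy to train and evaluate on one type at a time, or to change the mixture later.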