Welcome to the Wiki of the seminar ''Deep Natural Language Processing'' in the winter semester 2020/2021
This seminar is organized at the chair of Prof. Dr. Hannah Bast by Theresa Klumpp, Matthias Hertel and Natalie Prange. The seminar will take place every Tuesday, 2:15 pm - 3:45 pm, in the seminar room SR 00-010/14 in building 101 (if the COVID-19 conditions allow it) and via Zoom for those of you who want to attend online (see below for link and password). Attendance in one of these two forms is compulsory. There will be no session on Tuesday, November 10th, 2020 (seminar places are assigned via HISinOne in that week) and no sessions on Tuesday, December 29th, 2020 and Tuesday, January 5th, 2021 (Christmas break).
Important Links
There is a forum for important announcements and questions you might have. Please check the forum regularly!
The Zoom meeting we will use throughout the seminar: https://uni-freiburg.zoom.us/j/87399563594 (Password: DeepNLP20)
Modalities
Participants of the seminar will have to present one of the topics, either alone or as a group of two. Each presentation will be 30 minutes for one participant or 2 * 20 minutes for two. In addition to introducing the topic, each presentation must include a demo part where participants present a practical application of their topic.
What exactly this demo entails depends on the topic and will be discussed with each person/team separately. While we will provide suggestions, you are very welcome to bring in your own ideas. Examples of demos include the implementation of a small application, an interactive visualization, or the demonstration of a complex existing system which you have set up on your own.
Schedule for each individual presentation
Before Meeting 1: Research the given topic (starting from the pointers given below) and make a plan of what you want to talk about
Meeting 1 (3 weeks before your presentation): show us your plan + we settle on the scope of your presentation
Before Meeting 2: Understand / work out all the necessary details and play around (extensively) with existing software or write your own
Meeting 2 (2 weeks before your presentation): show us what you have done + we try to help with remaining problems
Before Meeting 3: Prepare your presentation and the demos you want to show
Meeting 3 (1 week before your presentation): show us what you have prepared + we help with remaining problems
Before your presentation: Finish your presentation and demo, including all the details
Sessions
Session | Date | Topic
1 | Wednesday, November 4th, 2020, 10:15 am - 11:45 am | Introduction and Organization (by Prof. Hannah Bast), Video Recording (MP4 Download) Slides
  | Tuesday, November 10th, 2020 | *** NO SESSION ***
2 | Tuesday, November 17th, 2020 | Machine Learning Introduction (by Theresa Klumpp), Video Recording (MP4 Download) Slides Code
3 | Tuesday, November 24th, 2020 | Deep Learning & PyTorch Introduction (by Matthias Hertel and Natalie Prange), Video Recording (MP4 Download) Slides Part 1 Part 2 Code Part 1 Part 2
  | Tuesday, December 1st, 2020 | *** NO SESSION ***
4 | Tuesday, December 8th, 2020 | Standard Language Model, Slides
5 | Tuesday, December 15th, 2020 | RNN Language Model
6 | Tuesday, December 22nd, 2020 | Word2Vec
  | Tuesday, December 29th, 2020 | *** NO SESSION ***
  | Tuesday, January 5th, 2021 | *** NO SESSION ***
7 | Tuesday, January 12th, 2021 | Attention & Transformer models
8 | Tuesday, January 19th, 2021 | Bidirectional Encoder Representations from Transformers (BERT)
9 | Tuesday, January 26th, 2021 | Machine Translation
10 | Tuesday, February 2nd, 2021 | Text Classification
11 | Tuesday, February 9th, 2021 | Convolutional Neural Networks for NLP
12 | Tuesday, February 16th, 2021 | Named Entity Disambiguation & Automatic Hyperparameter Optimization
Topics
The topics will be introduced and briefly explained in the first session. They all deal with how Deep Learning can be used in Natural Language Processing.
Please note that you are supposed to present the topic, not the material listed here. The material is only intended as a starting point for your research.
Standard Language Model
- Having a model of natural language is the basis for many NLP tasks. N-gram models are a simple way to obtain such a model.
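As one possible starting point for a demo, here is a minimal sketch of a bigram language model with add-one smoothing in plain Python; the toy corpus and function names are made up for illustration and are not part of the listed material:

{{{#!python
from collections import defaultdict

# Toy corpus; in practice you would train on a large text collection.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# Count unigrams and bigrams, padding each sentence with <s> and </s>.
unigram_counts = defaultdict(int)
bigram_counts = defaultdict(int)
vocab = set()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    vocab.update(tokens)
    for i, token in enumerate(tokens):
        unigram_counts[token] += 1
        if i + 1 < len(tokens):
            bigram_counts[(token, tokens[i + 1])] += 1

def bigram_prob(prev, word):
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + len(vocab))

def sentence_prob(sentence):
    """Probability of a sentence as the product of its bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word)
    return prob

print(sentence_prob("the cat sat on the rug"))
print(sentence_prob("the rug sat on the cat"))
}}}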
RNN Language Model
- Recurrent Neural Network (RNN) language models are in general superior to n-gram language models because they can model long-term dependencies. Explain RNNs, LSTM Networks and how they are used to model language.
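A minimal character-level LSTM language model, sketched in PyTorch (the framework introduced in session 3); the toy text, hyperparameters and class name are arbitrary choices for illustration:

{{{#!python
import torch
import torch.nn as nn

class CharRNNLM(nn.Module):
    """Predicts the next character from the previous characters."""
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        # x: (batch, seq_len) of character ids
        h, _ = self.lstm(self.embed(x))   # (batch, seq_len, hidden_dim)
        return self.out(h)                # logits: (batch, seq_len, vocab_size)

text = "hello world, hello seminar"
chars = sorted(set(text))
char_to_id = {c: i for i, c in enumerate(chars)}
ids = torch.tensor([[char_to_id[c] for c in text]])

model = CharRNNLM(vocab_size=len(chars))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train to predict character t+1 from the characters up to t.
inputs, targets = ids[:, :-1], ids[:, 1:]
for step in range(100):
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, len(chars)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(float(loss))
}}}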
word2vec
- Word2vec is a technique to represent words as vectors from a high dimensional vector space. The goal is that words that are semantically similar have similar vectors. This is a central method in many NLP problems.
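A small sketch of how word vectors could be trained with the third-party Gensim library (an assumption, not part of the material listed here); the toy sentences are made up, and with such a tiny corpus the resulting neighbours are not meaningful:

{{{#!python
from gensim.models import Word2Vec

# Toy corpus of tokenized sentences; real applications use millions of sentences.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "dog", "chases", "the", "cat"],
    ["the", "cat", "chases", "the", "mouse"],
]

# Parameter names follow gensim >= 4 (older versions use size= and iter=).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=200)

print(model.wv["king"][:5])                   # first entries of the word vector
print(model.wv.most_similar("king", topn=3))  # nearest neighbours in vector space
}}}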
Attention
- With the attention mechanism, a neural network can learn to focus on specific parts of the input. This has applications in Machine Translation, Language Modeling, Image Captioning and many more.
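The core of most attention variants is scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V. Here is a small NumPy sketch with random toy data (not taken from any of the papers, just an illustration of the formula):

{{{#!python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = softmax(scores)         # one distribution over the keys per query
    return weights @ V, weights

# 2 queries attending over 4 keys/values of dimension 8 (random toy data).
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.round(2))   # rows sum to 1: how much each query "looks at" each key
}}}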
Transformer models
- A new neural network architecture achieving state-of-the-art results in many NLP tasks. It is used by OpenAI in their famous GPT-2 paper to automatically generate text that is almost indistinguishable from human-written text.
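A sketch of a Transformer encoder stack using PyTorch's built-in modules; a real model would add token embeddings, positional encodings and a task-specific head on top, so this only shows the shapes involved:

{{{#!python
import torch
import torch.nn as nn

# A stack of 2 Transformer encoder layers over a toy batch.
# Shapes follow the PyTorch default: (sequence_length, batch_size, d_model).
encoder_layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, dim_feedforward=128)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

src = torch.rand(10, 3, 64)   # 10 tokens, batch of 3, embedding size 64
out = encoder(src)
print(out.shape)              # torch.Size([10, 3, 64])
}}}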
Bidirectional Encoder Representations from Transformers (BERT)
- BERT is a method for NLP pre-training based on the Transformer architecture. It is successfully applied in a large variety of NLP tasks.
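One possible starting point for a demo, assuming the third-party Hugging Face transformers package (not part of the listed material); with a recent version of the library, this downloads a pretrained BERT and produces one contextual vector per input token:

{{{#!python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Deep NLP is fun.", return_tensors="pt")
outputs = model(**inputs)

# One contextual vector per input token (including [CLS] and [SEP]).
print(outputs.last_hidden_state.shape)   # (1, number_of_tokens, 768)
}}}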
Convolutional Neural Networks for NLP
- Although originally developed for image analysis, Convolutional Neural Networks also have applications in Natural Language Processing.
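A minimal sketch of a Kim (2014)-style CNN for sentence classification in PyTorch; the vocabulary size, filter sizes and dummy batch are arbitrary illustration values:

{{{#!python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """A simplified CNN for sentence classification."""
    def __init__(self, vocab_size, embed_dim=50, num_filters=16,
                 kernel_size=3, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, x):
        # x: (batch, seq_len) word ids
        e = self.embed(x).transpose(1, 2)   # (batch, embed_dim, seq_len)
        h = torch.relu(self.conv(e))        # (batch, num_filters, seq_len - k + 1)
        pooled = h.max(dim=2).values        # max-over-time pooling
        return self.fc(pooled)              # class logits

model = TextCNN(vocab_size=1000)
dummy_batch = torch.randint(0, 1000, (4, 20))   # 4 "sentences" of 20 word ids
print(model(dummy_batch).shape)                 # torch.Size([4, 2])
}}}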
Text Classification
- The goal is to classify text using Machine Learning. Examples of NLP applications are sentiment analysis, topic labeling, and spam detection (a small scikit-learn sketch follows the material link below).
Survey that covers many algorithms and methods on text classification
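A minimal sketch of a classical text classification baseline, assuming the third-party scikit-learn library; the tiny spam/ham training set is made up for illustration:

{{{#!python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Tiny made-up training set; real experiments would use a benchmark corpus.
texts = [
    "win a free prize now", "cheap pills, click here",
    "meeting moved to 3 pm", "please review the attached draft",
]
labels = ["spam", "spam", "ham", "ham"]

clf = Pipeline([
    ("tfidf", TfidfVectorizer()),   # bag-of-words features weighted by tf-idf
    ("nb", MultinomialNB()),        # a simple but strong baseline classifier
])
clf.fit(texts, labels)
print(clf.predict(["free prize, click here", "draft of the meeting notes"]))
}}}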
Sentiment Analysis
- The task of identifying and analyzing opinions about entities and their aspects in text.
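As a simple, lexicon-based baseline (sentence-level only, not the entity- and aspect-level analysis described above), NLTK's VADER analyzer can be used; this sketch assumes NLTK is installed:

{{{#!python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")   # one-time download of the sentiment lexicon
sia = SentimentIntensityAnalyzer()

for sentence in ["The battery life is great.", "The screen is a huge disappointment."]:
    # polarity_scores returns negative/neutral/positive scores and a compound score.
    print(sentence, sia.polarity_scores(sentence))
}}}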
Question Answering on Text
- The task of extracting an answer to a given question from a given document/paragraph (reading comprehension) or a large set of documents like Wikipedia (open domain question answering). The most prominent dataset for reading comprehension tasks is currently the SQuAD dataset.
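A sketch of extractive question answering, assuming the third-party Hugging Face transformers package (not part of the listed material); on first use this downloads a model fine-tuned on SQuAD:

{{{#!python
from transformers import pipeline

qa = pipeline("question-answering")
result = qa(
    question="Where does the seminar take place?",
    context="The seminar takes place in seminar room SR 00-010/14 in building 101 "
            "and via Zoom for participants who attend online.",
)
print(result)   # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': '...'}
}}}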
Question Answering on Knowledge Bases
- The task of extracting an answer to a given question from a knowledge base like Wikidata.
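Under the hood, such systems translate the question into a structured query. Here is a sketch of the kind of SPARQL query that has to be constructed, run against Wikidata with the third-party SPARQLWrapper package; the property P57 ("director") is a real Wikidata property, but verify the details before relying on this:

{{{#!python
from SPARQLWrapper import SPARQLWrapper, JSON

# "Who directed Inception?" expressed as a SPARQL query over Wikidata.
sparql = SPARQLWrapper("https://query.wikidata.org/sparql", agent="deep-nlp-seminar-demo")
sparql.setQuery("""
SELECT ?directorLabel WHERE {
  ?film rdfs:label "Inception"@en ;
        wdt:P57 ?director .
  ?director rdfs:label ?directorLabel .
  FILTER (LANG(?directorLabel) = "en")
}
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["directorLabel"]["value"])
}}}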
Machine Translation
- In recent years, machine translation systems like Google Translate and DeepL have made great progress. We will look at such a system in detail, and see how it is even possible to translate between language pairs the model has never seen during training (a small translation example follows below).
Is machine translation better than human translators? Blogpost 1 and Blogpost 2
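A small translation sketch, assuming the third-party transformers and sentencepiece packages and a publicly available Helsinki-NLP English-to-German model (none of which are part of the listed material):

{{{#!python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer(["Machine translation has made great progress."], return_tensors="pt")
translated = model.generate(**batch)
print(tokenizer.batch_decode(translated, skip_special_tokens=True))
}}}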
Bias in Language Models
- Language Models can only be as good as the input we give them (“Garbage in, garbage out”). If the input data is biased, the models will mimic that behavior. This can lead to real-life problems (a small word-embedding example follows below).
This literature review gives a good overview of gender bias in different areas of NLP and contains lots of other papers and resources.
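A common (and deliberately simplistic) way to illustrate such bias is to probe pretrained word embeddings with analogies; this sketch assumes the third-party Gensim library and downloads GloVe vectors (roughly 130 MB) on first use:

{{{#!python
import gensim.downloader as api

# Pretrained GloVe vectors trained on Wikipedia + Gigaword.
vectors = api.load("glove-wiki-gigaword-100")

# "X is to woman as Y is to man": stereotypical associations often surface here.
print(vectors.most_similar(positive=["doctor", "woman"], negative=["man"], topn=3))
print(vectors.most_similar(positive=["programmer", "woman"], negative=["man"], topn=3))
}}}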
Named Entity Disambiguation
- Named Entity Disambiguation, also referred to as entity linking, is the task of linking named entities in text to their corresponding entries in a knowledge base like Wikidata or Wikipedia.
Reinforcement Learning
- RL has many applications in NLP. You should pick one or two that you are interested in and focus on them.
Examples: Text-based games, Dialogue Generation or Question Answering
Automatic Hyperparameter Optimization
- When designing and training neural networks, many decisions about the network architecture and the training process have to be made, all of which affect the final outcome. Manually tuning these decisions is a tedious task. Recent work automates the process of finding the optimal setting (a small grid-search example follows the material below).
A study of the impact of various hyperparameters on a text classification task
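A minimal sketch of the idea using plain grid search over a scikit-learn text classification pipeline; automated methods from recent work (e.g. Bayesian optimization) are much smarter, and the tiny dataset here is made up for illustration:

{{{#!python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

texts = ["win a free prize now", "cheap pills, click here", "free offer, click now",
         "meeting moved to 3 pm", "please review the attached draft", "notes from the meeting"]
labels = [1, 1, 1, 0, 0, 0]   # 1 = spam, 0 = ham

pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LogisticRegression())])

# The "hyperparameters" here are the n-gram range of the features and the
# regularization strength C of the classifier.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(texts, labels)
print(search.best_params_, search.best_score_)
}}}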
More possible topics
- Speech Recognition (a.k.a. speech to text)
- Text Summarization
- Chat bots & dialogues
- Parsing & semantic role labeling
- Transfer Learning and Multi-task learning with the Text-to-Text Transfer Transformer (T5): Blogpost and Paper
- Personality Detection from Text
- Fake News Detection