Welcome to the Wiki of the course "Information Retrieval" in the winter term 2019/2020
Results of the official evaluation of this course
The course is taught by Prof. Dr. Hannah Bast, assisted by Patrick Brosi. It takes place every Tuesday from 14:15 to 15:45 in SR 101-00-010/14. The first lecture is on Tuesday 22.10.2019 and the last lecture is on Tuesday 04.02.2020. There will be no lectures on the Tuesdays 24.12.2019 and 31.12.2019 (Christmas break) and on Tuesday 11.02.2020 (we finish a week early). This makes 14 lectures altogether.
The tutors for this course are Daniel Bindemann, Claudius Korzen, Julian Tischner and Andreas Woitzik. The administrator of the supporting systems (Daphne, SVN, Forum, Jenkins) is Axel Lehmann.
Important Links
Our course management system Daphne (login with your RZ user name and password).
The forum for announcements and questions.
The manual for how to ask questions on the forum.
Our CodingStandards for C++, Java and Python 3.
Information about Subversion (SVN) can be found here (in German) and here (in English); about some editors (including Vim) here (in German); and about the installation of gtest (for C++ only) here (in English).
The courses from previous semesters: Information Retrieval WS 18/19, Information Retrieval WS 17/18, Information Retrieval WS 16/17, Information Retrieval WS 15/16, Information Retrieval WS 13/14, Information Retrieval WS 12/13.
The exams from previous semesters: WS 18/19, WS 17/18, WS 16/17, WS 15/16, WS 13/14, WS 12/13.
Here you can download our Linux Image (for VirtualBox or VMware).
A cheat sheet for NumPy and SciPy can be found here.
Lecture Slides, Video Recordings, Exercise Sheets, and Code
Lecture 1, Tuesday, October 22, 2019 (Introduction, Inverted Index, Zipf's Law): Video Recording (MP4 Download), Slides, Exercise Sheet 1, Code from the lecture + unit tests for Exercise Sheet 1, Dataset for ES1 (189,897 movies with title + description), Solution.
Lecture 2, Tuesday, October 29, 2019 (Ranking and Evaluation): Video Recording (MP4 Download), Slides, Exercise Sheet 2, Code from the lecture + unit tests for Exercise Sheet 2, NEW Movies Dataset (107,769 movies with title + longer description), Movies Training Benchmark, Movies Testing Benchmark, Table for your ranking results, Solution.
Lecture 3, Tuesday, November 5th, 2019 (Efficient List Intersection, Lagrange Multipliers): Video Recording (MP4 Download), Slides, Exercise Sheet 3, Solution.
Lecture 4, Tuesday, November 12th, 2019 (Compression, Codes, Entropy): Video Recording (MP4 Download), Slides, Exercise Sheet 4, Solution.
Lecture 5, Tuesday, November 19th, 2019 (Fuzzy Search, Edit Distance, q-Gram Index): Video Recording (MP4 Download), Slides, Exercise Sheet 5, Code from the lecture + unit tests for Exercise Sheet 5, Wikidata Entities (2,920,180 entities, name + popularity + description + additional information), Wikidata Entities SMALL (100,000 entities), Table for your results, Solution.
Lecture 6, Tuesday, November 26th, 2019 (Web applications, Part 1): Video Recording (MP4 Download), Slides, Exercise Sheet 6, Code skeleton + test queries for Exercise Sheet 6, Wikidata Entities (same as for ES 5), Solution.
Lecture 7, Tuesday, December 3rd, 2019 (Web applications, Part 2): Video Recording (MP4 Download), Slides, Exercise Sheet 7, Code skeleton + test queries for Exercise Sheet 7, Wikidata Entities (same as for ES 5 + 6), Solution.
Lecture 8, Tuesday, December 10th, 2019 (Vector Space Model): Video Recording (edited version from WS 17/18) (MP4 Download), Slides, Code from the lecture, Code skeleton for ES 8, Exercise Sheet 8, Movies Dataset (same one as for ES2), Movies testing benchmark (same one as for ES2), Table for your ranking results, Solution.
Lecture 9, Tuesday, December 17th, 2019 (Latent Semantic Indexing): Video Recording (MP4 Download), Slides, Code from the lecture, Exercise Sheet 9, Solution.
Lecture 10, Tuesday, January 7th, 2020 (Classification, Naive Bayes): Video Recording (MP4 Download), Slides, Code Template for ES 10, Table for your F-measures, Datasets for ES10: Film Genres (train, Variant 1), Film Genres (test, Variant 1), Film Genres (train, Variant 2), Film Genres (test, Variant 2), Film Ratings (train, Variant 1), Film Ratings (test, Variant 1), Film Ratings (train, Variant 2), Film Ratings (test, Variant 2), Exercise Sheet 10, Solution.
Lecture 11, Tuesday, January 14th, 2020 (Linear Classifiers, Perceptrons, Logistic Regression): Video Recording (MP4 Download), Slides, Code Template for ES 11, Table for your results, Datasets for ES11 (same as for ES 10): Film Genres (train, Variant 1), Film Genres (test, Variant 1), Film Genres (train, Variant 2), Film Genres (test, Variant 2), Film Ratings (train, Variant 1), Film Ratings (test, Variant 1), Film Ratings (train, Variant 2), Film Ratings (test, Variant 2), Exercise Sheet 11, Solution.
Lecture 12, Tuesday, January 21, 2020 (Knowledge Bases, SPARQL, Translation to SQL): Video Recording (MP4 Download), Slides, Exercise Sheet 12, Code Template for ES12, Datasets for ES12: wikidata.zip (31M triples from Wikidata), wikidata.5M.zip (Subset of wikidata.tsv with only 5M triples), Table for your results, Solution.
Lecture 13, Tuesday, January 28, 2020 (POS-Tagging, Entity Recognition, Viterbi Algorithm): Video Recording (MP4 Download), Slides, Exercise Sheet 13, transition-probabilities.tsv, emission-probabilities.tsv, Code Template for ES13, Solution.
Lecture 14, Tuesday, February 4, 2020 (Course Evaluation, Exam, Work at our Chair): Video Recording (MP4 Download), Slides, Results of the official course evaluation