Welcome to the Wiki of the course "Information Retrieval" in the winter term 2018/2019
Results of the official evaluation of this course
The course is taught by Prof. Dr. Hannah Bast, assisted by Claudius Korzen. The course will be based on the video recordings from the course Information Retrieval WS 17/18. There will be an introductory in-class lecture on Tuesday, October 16, 2018, from 2:15 pm until 3:45 pm in seminar room 00-010/14 in building 101. There will be 14 lectures and 13 exercise sheets altogether.
The tutors for this course are Daniel Bindemann, Patrick Brosi, and Niklas Schnelle. The administrator of the supporting systems (Daphne, SVN, Forum, Jenkins) is Axel Lehmann.
Important Links
Our course management system Daphne (login with your RZ user name and password).
The forum for announcements and questions.
The manual for how to ask questions on the forum.
Our Points scheme for the exercises.
Our CodingStandards for C++, Java and Python 3.
Information about Subversion (SVN) can be found here (in German) and here (in English); about some editors (including Vim) here (in German); and about the installation of gtest (for C++ only) here (in English).
The courses from previous semesters: Information Retrieval WS 17/18, Information Retrieval WS 16/17, Information Retrieval WS 15/16, Information Retrieval WS 13/14, Information Retrieval WS 12/13.
The exams from previous semesters: WS 17/18, WS 16/17, WS 15/16, WS 13/14, WS 12/13.
Here you can download our Linux Image (for VirtualBox or VMware).
A cheat sheet for NumPy and SciPy can be found here.
Lecture Slides, Video Recordings, Exercise Sheets, and Code
Lecture 1, Tuesday, October 16, 2018 (Introduction, Inverted Index, Zipf's Law): YouTube Part 1 YouTube Part 2 (Download Part 1 Download Part 2), Slides, Code from the lecture (+ equivalent code in Java and C++), Exercise Sheet 1, TIP file, Dataset for ES1 (189,897 movies with title + description), Solution.
Lecture 2, Tuesday, October 23, 2018 (Ranking and Evaluation): YouTube (Download), Slides, Exercise Sheet 2, TIP file, Movies Dataset (same one as for ES1), Movies Benchmark, Table for your ranking results, Solution.
Lecture 3, Tuesday, November 6, 2018 (Efficient List Intersection): YouTube (Download), Slides, Exercise Sheet 3, Postings Lists, basic code to get you started in Java and C++, Table for your intersection results, Solution.
Lecture 4, Tuesday, November 13, 2018 (Compression, Codes, Entropy): YouTube (Download), Slides, Exercise Sheet 4, Solution.
Lecture 5, Tuesday, November 20, 2018 (Fuzzy Search, Edit Distance, q-Gram Index): YouTube (Download), Slides, Exercise Sheet 5, Wikidata Entities (2,920,180 entities, name + popularity + description + additional information), TIP file, code from the lecture in Java and the analogous code in C++, Table for your results, Solution.
Lecture 6, Tuesday, November 27, 2018 (Web applications, part 1): Video Recording (Download), Slides, Exercise Sheet 6, Wikidata Entities (same as for ES5), TIP file, (part of) the code from the lecture in Java and the analogous code in C++, Solution.
Lecture 7, Tuesday, December 4, 2018 (Web applications, part 2: JavaScript, Vulnerabilities, Cookies, Unicode): Video Recording (Download), Slides, Exercise Sheet 7, Wikidata Entities (same as for ES5+ES6), TIP file, Solution.
Lecture 8, Tuesday, December 11, 2018 (Vector space model): Video Recording (Download), Slides, Code from the lecture, Exercise Sheet 8, TIP file, Movies Dataset (same one as for ES2), Movies Benchmark (same one as for ES2), Table for your ranking results, Solution.
Lecture 9, Tuesday, December 18, 2018 (Clustering, k-means): Video Recording (Download), Slides, Exercise Sheet 9, Solution.
Lecture 10, Tuesday, January 8, 2019 (Latent Semantic Indexing): Video Recording (Download), Slides, Code from the lecture, Exercise Sheet 10, Solution.
Lecture 11, Tuesday, January 15, 2019 (Classification, Naive Bayes): Video Recording (Download), Slides, Exercise Sheet 11, Code for ES 11, Table for your F-measures, Datasets for ES11: Film Genres (train, Variant 1), Film Genres (test, Variant 1), Film Genres (train, Variant 2), Film Genres (test, Variant 2), Film Ratings (train, Variant 1), Film Ratings (test, Variant 1), Film Ratings (train, Variant 2), Film Ratings (test, Variant 2), Solution.
Lecture 12, Tuesday, January 22, 2019 (Knowledge Bases, SPARQL, Translation to SQL): Video Recording (Download), Slides, Exercise Sheet 12, Example files from the lecture + TIP file, Code Template for ES12, Datasets for ES12: wikidata.zip (48M triples from Wikidata), wikidata.5M.zip (subset of wikidata.tsv with only 5M triples), Table for your results, Solution.
Lecture 13, Tuesday, January 29, 2019 (POS-Tagging, Entity Recognition, Viterbi Algorithm): Video Recording (Download), Slides, Exercise Sheet 13, transition-probabilities.tsv, word-distribution.tsv, TIP file, Solution.
Lecture 14, Tuesday, February 5, 2019 (Course Evaluation, Exam, Work at our Chair): Video Recording (Download), Slides.