AD Teaching Wiki:

Exercise Sheet 3

Instructions for uploading are the same as for Exercise Sheet 1. If you forgot them, you can read them again here.

Your solutions (files can only be read by the uploader and by us)

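|| No. || Name || Solution (PDF) || Code (ZIP or TGZ) ||
|| 1 || Mirko Brodesser || PDF || ZIP ||
|| 2 || Björn Buchhold || PDF || ZIP ||
|| 3 || Johann Latocha || PDF || TGZ ||
|| 4 || Johannes Stork || PDF || TGZ ||
|| 5 || Marius Greitschus || PDF || --- ||
|| 6 || Triatmoko || PDF || Rar ||
|| 5 || Jonas Krisch || PDF || --- ||
|| 7 || Eric lacher || PDF || ZIP ||
|| 8 || Manuela Ortlieb || PDF || ZIP ||
|| 9 || Richard Zahoransky || PDF || --- ||
|| 10 || Florian Bäurle || PDF || ZIP ||
|| 11 || Zhongjie Cai || PDF || ZIP ||
|| 12 || Paresh Paradkar || PDF || ZIP ||
|| 13 || Shou-Yu Chao || PDF || ZIP ||
|| 14 || Jens Silva Santisteban || PDF || ZIP ||
|| 15 || Alexander Nutz || PDF || ZIP ||
|| 16 || Claudius Korzen || PDF || ZIP ||
|| 17 || Achille Nana || N.A || ZIP ||
|| 18 || Daniel Frey || PDF || TGZ ||
|| 19 || Matthias Frorath || PDF || ZIP ||
|| 20 || Markus Gruetzner || PDF || ZIP ||
|| 21 || Dragos Sorescu || PDF || ZIP ||
|| 22 || Jonas Koenemann || PDF || ZIP ||
|| 23 || Thomas Liebetraut || PDF || TGZ ||
|| 24 || Alexander Schneider || PDF || ZIP ||
|| 25 || Daniel Schauenberg || PDF || tar.gz ||
|| 26 || Alexander Gutjahr || PDF || tar.gz ||
|| 27 || Waleed butt || PDF || ZIP ||
|| 28 || Waldemar Wittmann || PDF || TGZ ||
|| 29 || Jonas Sternisko || PDF || tar.gz ||
|| 30 || Matthias Sauer || PDF || zip ||
|| 31 || Björn Geiger || PDF || zip ||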

These were the questions and comments on Exercise Sheet 4

Hi Paresh + all. If you do and-ish retrieval, all documents which contain at least one of the query words are relevant. If you do and retrieval, only the documents which contain all of the query words are relevant. Independent of which scoring scheme you use, you can do it either way. Just clearly state which way you are doing it, and things will be fine. Hannah 10Nov09 2:46am
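A minimal sketch of the two notions described in the comment above, assuming a toy collection held in a Python dict. The names `build_index`, `and_retrieval`, and `andish_retrieval` are made up for illustration and are not from the lecture or any exercise code.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def and_retrieval(index, query):
    """Documents that contain ALL of the query words."""
    lists = [index.get(word, set()) for word in query.lower().split()]
    return set.intersection(*lists) if lists else set()


def andish_retrieval(index, query):
    """Documents that contain AT LEAST ONE of the query words."""
    lists = [index.get(word, set()) for word in query.lower().split()]
    return set.union(*lists) if lists else set()


# Hypothetical toy collection, for illustration only.
docs = {1: "scottish tennis", 2: "tennis skills", 3: "blancmange recipe"}
index = build_index(docs)
print(and_retrieval(index, "scottish tennis"))     # only document 1 has both words
print(andish_retrieval(index, "scottish tennis"))  # documents 1 and 2 have at least one
```

Whichever of the two result sets is used as the matching set, the relevance judgements should be made against the same stated convention.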

Hi, if I use and-ish retrieval, how is the relevance of hits measured? Are the hits that contain one of the keywords relevant, or only those that contain all of the keywords? I am confused!! Paresh 10 Nov 1:31AM

Hi Dragos + all, pick three documents which got a score of zero. For any reasonable query and a not too small collection there should be such documents. Hannah 10Nov09 0:24am

Hello :). For Exercise 3, should we find three relevant documents that were not returned at all, or that were ranked low? I guess the goal is to discuss why they got a low rank, right? :) Dragos 9Nov 11.56 PM

Hi Dragos, if you want you can use your implementation from Exercise 1 (extended to deal with scores, of course). If your implementation could only deal with two-word queries, you have to find a suitable two-word query (see what I wrote below), but that should be possible. I don't understand what this has to do with the vector-space model though? The vector space model is just a way to get a formula for ranking documents. Processing the queries is still done via inverted lists. Understand that inverted lists are just an efficient way to store the non-zero entries of a sparse matrix. You would never store the term-document matrix as a two-dimensional array, that would be huge ... and a big waste, because most entries are actually zero (that's what "sparse" means). Hannah 9Nov09 2:31am
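To illustrate the point about sparsity, here is a rough sketch that stores only the non-zero entries of a term-document matrix as inverted lists and ranks documents from them. Using the raw term frequency as the score, and summing scores over the query words, are assumptions made for this example; the formula from the lecture may differ. The function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_lists(docs):
    """Return term -> list of (doc_id, tf), one posting per NON-ZERO matrix entry."""
    lists = defaultdict(list)
    for doc_id in sorted(docs):
        counts = {}
        for word in docs[doc_id].lower().split():
            counts[word] = counts.get(word, 0) + 1
        for word, tf in counts.items():
            lists[word].append((doc_id, tf))
    return dict(lists)


def rank(lists, query):
    """Sum the per-term scores of each document; documents never touched score zero."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for doc_id, tf in lists.get(word, []):
            scores[doc_id] += tf
    # Highest score first, ties broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy collection, for illustration only.
docs = {1: "tennis tennis scotland", 2: "tennis skills", 3: "blancmange"}
lists = build_inverted_lists(docs)
postings = sum(len(plist) for plist in lists.values())
print(postings, "postings instead of", len(docs) * len(lists), "matrix cells")
print(rank(lists, "tennis scotland"))
```

Even in this tiny collection fewer than half of the matrix cells are non-zero; the gap grows quickly with the size of the collection.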

For exercises 2 and 3, should the query be multi-word, or do the exercises refer to the two-word queries implemented in Exercise Sheet 1? I am asking because it's clearly simpler to use the "easy" method (presented in the lecture) for two-word queries and the "vector space model" for a multi-word query. Thank you in advance! Dragos 9Nov 00.27

To Björn + all: I said non-trivial so that you don't take a super-specific query which has only one or two matching documents, which you can easily retrieve with 100% precision and 100% recall with the right query. An example would be an article with a very specific title like "On the influence of Blancmanges from Skyron on Scottish tennis playing skills". Then, if your collection is not super large, the query "blancmanges skyron scottish" will be perfect. Don't pick a query like that. Also, do not just formulate a query, but also write down the search request that you had in mind, so that you have a yardstick to determine what is relevant for that query and what is not. Here are some examples of queries from TREC, the big IR benchmark conference. Each query there has a so-called "title" (what you would typically enter as query words), a "description" (a short description of what the query is about), and a "narrative" (a long description of what the query is about). Hannah 08Nov09 3:28pm
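For reference, a small sketch of how precision and recall could be computed once the relevant documents for a search request have been written down. The function name and the document ids are invented for this example, and returning 0.0 for an empty set is just one possible convention.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / |retrieved|, recall = hits / |relevant|."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Assumption: an empty denominator yields 0.0 rather than an error.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# A "trivial" query: one matching document, which is also the only relevant one.
print(precision_recall([7], [7]))
# A less trivial one: four documents returned, three relevant, two in common.
print(precision_recall([1, 2, 3, 4], [2, 4, 5]))
```

The first call shows why a super-specific query is uninteresting: both values come out perfect and there is nothing to discuss.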

Concerning the exercises: What is a non-trivial query? A query that matches multiple documents, or does it have to consist of multiple words, too? Anything else? Björn 8Nov09 14:26

The recording works on Mac OS with Flip4Mac: http://www.microsoft.com/windows/windowsmedia/player/wmcomponents.mspx Jonas 8Nov09 14:06

Maybe you could consider putting it on electures. As far as I know it is the standard place for putting recordings and slides online, and there are already some solutions for putting Lecturnity files online in a platform-independent manner. I don't know if it's practical, but it's also nice having all lectures in one place, I'd say... http://electures.informatik.uni-freiburg.de/ Alex 7Nov09 18:05

On Linux I see no suitable plugin. I would like to download the .lpd file too. We can test it with our old Lecturnity versions (I have 2.0), and if it doesn't work we can download 4.x at http://www.lecturnity.de/de/download/lecturnity-player/ Waldemar 7Nov09 12:49

Thanks, Paresh, yes I can do that. So do all students have access to the latest version (should be at least 4.x otherwise it will not work I think) of Lecturnity? Hannah 6Nov09 11:35pm

Hi, yes, the recording works properly after downloading the plug-in. Kindly upload the remaining files. It would also be helpful if you could give links to the .lpd files, since it is easier to download them and play them in the Lecturnity player than in the browser, and one can play them at any time. Paresh 6 Nov09 11:25pm

To Mirko + all: whenever we write "prove", we mean a proof in the mathematical sense. For the exercises, the challenge is often two-fold. You first have to turn the statement of the exercise into a formal statement. Then you have to prove that statement. For Exercise 4 you will first have to specify the order in which the inverted lists should be sorted. Then you have to prove that the document with the i-th largest score (formed by max aggregation), where i <= k, is indeed among the first k entries, with respect to the specified order, in at least one of the inverted lists. Hannah 3Nov09 10:29pm
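The statement can be tried out on small inputs before attempting the proof. The sketch below is only an experiment, not a proof: it assumes the lists are ordered by descending score, assumes all scores are distinct, and uses made-up function names and data. It reads just the first k entries of each list and compares the result against a full scan.

```python
import heapq


def top_k_prefix(inverted_lists, k):
    """Top-k by max aggregation, reading only the first k entries of each list."""
    best = {}
    for postings in inverted_lists.values():
        # Assumption: lists are sorted by descending score; sorted here so the
        # example is self-contained.
        by_score = sorted(postings, key=lambda posting: -posting[1])
        for doc_id, score in by_score[:k]:
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    return heapq.nlargest(k, best.items(), key=lambda item: item[1])


def top_k_full_scan(inverted_lists, k):
    """Reference result: aggregate over ALL entries of all lists."""
    best = {}
    for postings in inverted_lists.values():
        for doc_id, score in postings:
            best[doc_id] = max(score, best.get(doc_id, float("-inf")))
    return heapq.nlargest(k, best.items(), key=lambda item: item[1])


# Hypothetical (doc_id, score) postings with distinct scores.
lists = {
    "a": [(1, 0.9), (2, 0.5), (3, 0.1), (4, 0.05)],
    "b": [(3, 0.8), (4, 0.7), (1, 0.2), (2, 0.3)],
}
print(top_k_prefix(lists, 2))
print(top_k_full_scan(lists, 2))
```

Agreement on examples like this one is of course no substitute for the formal argument asked for in the exercise.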

About Exercise 4: I actually don't know how to write down the proof of top-k retrieval with the maximum score (but I think I know how/why it works). Is it okay to describe it in words, or do we have to formalize it in a certain way? Mirko 5Nov09 22:21pm

Ok, I have played around a bit with Lecturnity myself, and published Lecture 3, see the link above. For Marjan it worked, he only needed to install some Windows Media plugin for his Firefox. Please also try, and tell me if there are problems. Also tell me if everything goes fine. (It's enough if one or two people tell me.) If it does, I will also publish Lecture 1. Lecture 2, as I said, is lost to the world forever (well, at least the audio), since audio recording did not work that day. Hannah 3Nov09 10:06pm

Dear Marius + all: Yes, the lectures are recorded, except for Lecture 2, where there were technical problems (no signal from the microphone). I always copy the Lecturnity files to my machine after the lecture, but don't know yet how to publish them on the web so that they are easily viewable by others. I will meet with our group's technician tomorrow, and ask him about this. Stay tuned! Hannah 5Nov09 8:36pm

Hi, I noticed that you record your lectures. Is it somehow possible to download these recordings or will they be released later? Marius Nov5th, 4:54 p.m.

Hi Waleed, when you create a conflict, it's your responsibility to remove it and not leave a mess behind. If the instructions given when the conflict occurs do not suffice, try to find more information on the Wiki help pages. Hannah 3Nov09 9:00pm

I uploaded my files and added a new row to the table on the Exercise Sheet 2 page, but when I pressed the save button it showed me a conflict between my version and another version of the list. How can I remove the conflict? Has my assignment been submitted properly or not? Waleed 3Nov09

AD Teaching Wiki: SearchEnginesWS0910/ExerciseSheet3 (last edited 2009-11-12 20:29:16 by Hannah Bast)