Line 3: | Line 3: |
Here are PDFs of the slides of the lectures so far: [[attachment:SearchEnginesWS0910/lecture-1.pdf|Lecture 1]], [[attachment:SearchEnginesWS0910/lecture-2.pdf|Lecture 2]]. | Here are PDFs of the slides of the lectures so far: [[attachment:SearchEnginesWS0910/lecture-1.pdf|Lecture 1]], [[attachment:SearchEnginesWS0910/lecture-2.pdf|Lecture 2]], [[attachment:SearchEnginesWS0910/lecture-3.pdf|Lecture 3]], [[attachment:SearchEnginesWS0910/lecture-4.pdf|Lecture 4]]. |
Line 5: | Line 5: |
Here are PDFs of the exercise sheets so far: [[attachment:SearchEnginesWS0910/exercise-1.pdf|Exercise Sheet 1]], [[attachment:SearchEnginesWS0910/exercise-2.pdf|Exercise Sheet 2]]. | Here are .lpd files of the recordings of the lectures so far (except Lecture 2, where we had problems with the microphone): [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-1.lpd|Lecture 1]] [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-3.lpd|Lecture 3]] [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-4.lpd|Lecture 4]]. |
Line 7: | Line 7: |
Here are your solutions and comments on the previous exercise sheets: [[SearchEnginesWS0910/ExerciseSheet1|Exercise Sheet 1]]. | Here are PDFs of the exercise sheets so far: [[attachment:SearchEnginesWS0910/exercise-1.pdf|Exercise Sheet 1]], [[attachment:SearchEnginesWS0910/exercise-2.pdf|Exercise Sheet 2]], [[attachment:SearchEnginesWS0910/exercise-3.pdf|Exercise Sheet 3]], [[attachment:SearchEnginesWS0910/exercise-4.pdf|Exercise Sheet 4]]. Here are your solutions and comments on the previous exercise sheets: [[SearchEnginesWS0910/ExerciseSheet1|Solutions and Comments 1]], [[SearchEnginesWS0910/ExerciseSheet2|Solutions and Comments 2]], [[SearchEnginesWS0910/ExerciseSheet3|Solutions and Comments 3]] = Exercise Sheet 3 = The recordings of all lectures are now available, see above. Lecture 2 is missing because we had technical problems there. To play the recordings (it's .lpd files) you need the Lecturnity Player. [[http://www.lecturnity.de/de/download/lecturnity-player|You can download the player for free here]]. |
Line 11: | Line 16: |
= Exercise Sheet 2 = Details about our three servers, as required for Exercise 4, will be given here by the end of Friday. [[SearchEnginesWS0910/ExerciseSheet2|Here you can upload your solutions for Exercise Sheet 2]]. |
[[SearchEnginesWS0910/ExerciseSheet4|Here you can upload your solutions for Exercise Sheet 4]]. |
Line 18: | Line 20: |
Hi Björn + all: For Exercise 4 from Exercise Sheet 1 you had to write code that is at least able to process 2-word queries. If your code can indeed only handle 2-word queries and not an arbitrary number of query words, that is also fine for this exercise, you won't get less points because of that. Your second question is also very valid. You should put the various functionalities into modules / classes of their own, so that you can easily combine them for the three different binaries required for Exercises 1 - 3. Each of your three programs will then be quite short, just putting together the right things. I should have added it to my list of evil coding NoNos: never ever duplicate code, but instead put it in a class / module of its own. I hope this answers your questions, if not please ask again. Sorry for the late answer, but I was super busy until now, hardly had time to breathe. '''Hannah 30Oct09 19:08''' | @Mirko: Sure, but an inverted list is a list of words where the Doc-IDs are attached to each words in which the words occur. So for Example: If word no. 5 occurs in Doc1, Doc2 and Doc3 and word no. 2 occurs in Doc5, the list would look like: 5 -> Doc1, Doc2, Doc3; 2 -> Doc5. Or am I mistaken? My question then is, how long should these attached lists be in average case? I mean, one could imagine that we got 1mil. documents over 3 words, so these lists could get very large... |
Line 20: | Line 22: |
EDIT: Oh ok. Now, I see your point. It's not an index, it's a list. Okay. So, what is an inverted list with Doc-IDs, then? | |
Line 21: | Line 24: |
I have a question concerning exercise 2. There was no concrete task to produce "query processing code" on ex sheet 1. Are there any requirements that have to be fulfilled? Should it be able to handle two word queries? n-word queries? Additionally there is something else I want to ask: I think it surely isn't bad practice to write a more generic webserver and use it for exercises 1-3. Apart from that it says "change your code" some times in the exercises. How should your submission behave w.r.t. the exercises? Different src files / executables for each exercise? One program that solve each exercises depending on startup parameters? Anything else? '''Björn 30ct09 2:25pm''' | EDIT EDIT: And to your question, Mirko, take a look at [[http://snippets.dzone.com/posts/show/93]]. Maybe this helps... '''Marius 11/14/2009 08:46pm''' |
Line 23: | Line 26: |
I have now reorganized the page. Old stuff went to separate pages (links above). The idea is that the front page is always for the current lecture / exercises. The problem with your exercise page should be solved now, Ivo. '''Hannah 30Oct09 00:05am''' | @Marius: I think we are supposed to generate one inverted __list__ of size m, with doc IDs from 1..n (therefore n >= m, because no duplicates?). |
Line 25: | Line 28: |
I am having problems accessing my exercise page after logging in - IvoChichkovExercises, '''Ivo 29Oct 22:56pm''' | Now a question from my side: for Exercise 4, programming the compression in __Java__, is there any __good__ tutorial about how to handle the bit-level operations? (Otherwise, I think, it would cost me too much time.) '''Mirko 14Nov09, 19:18''' |
Line 27: | Line 30: |
Sorry to bother you. I added the Link to exercise sheet 2 with the linked pdf. Needed this to find the sheet as fast as possible. '''Marius 29Oct 10:04 p.m.''' | Hi, do you have any suggestions what the best numbers for m and n in exercise 4 should look like? Or are we supposed to mess around a bit with ints and longs? And: How long should the list of documents in the inverted index be? '''Marius 14Nov09 6:40pm''' |
Line 29: | Line 32: |
Is there a webpage for exercise sheet 2 somewhere? '''Johannes 29Oct 07:45 pm''' | And just to clarify what a single-cycle permutation is. Here is an example for an array of size 5 with a permutation that is a single cycle: 5 4 1 3 2. Why single cycle? Well, A[1] = 5, A[5] = 2, A[2] = 4, A[4] = 3, A[3] = 1. (My indices in this example are 1,...,5 and not 0,...,4.) Here is an example of a permutation with three cycles: 2 1 4 3 5. The first cycle is A[1] = 2, A[2] =1. The second cycle is A[3] = 4, A[4] = 3. The third cycle is A[5] = 5. '''Hannah 12Nov09 8:04pm''' Hi Daniel + all, I don't quite understand your question and your example (if your array is 1 5 3 4 2, why is A[1] = 3?). In case you refer to the requirement of the exercise that the permutation consists only of a single cycle. That is because your code should go over each element exactly once (it should, of course, stop after n iterations, where n is the size of the array). If your permutation has more than one cycle, it is hard to achieve that. Also note that for both (1) and (2), the sum of the array values should be sum_i=1,...,n i = n * (n+1) / 2. '''Hannah 12Nov09 7:54pm''' Hi, I just looked at the new exercise sheet 4, in exercise 1 we should generate a permutation and sum the resulting array up, am I wrong or doesn't iterating method two iterate throw the whole array in every situation. for ex.: n= 5 permutation: 1 5 3 4 2, then A[1] = 3, A[A[1]]= A[3] = 1, A[1] = 3 ... '''Daniel 12Nov09 19:44pm''' |
Welcome to the Wiki page of the course Search Engines, WS 2009 / 2010. Lecturer: Hannah Bast. Tutorials: Marjan Celikik. Course web page: click here.
Here are PDFs of the slides of the lectures so far: Lecture 1, Lecture 2, Lecture 3, Lecture 4.
Here are .lpd files of the recordings of the lectures so far (except Lecture 2, where we had problems with the microphone): Lecture 1 Lecture 3 Lecture 4.
Here are PDFs of the exercise sheets so far: Exercise Sheet 1, Exercise Sheet 2, Exercise Sheet 3, Exercise Sheet 4.
Here are your solutions and comments on the previous exercise sheets: Solutions and Comments 1, Solutions and Comments 2, Solutions and Comments 3
Exercise Sheet 3
The recordings of all lectures are now available, see above. Lecture 2 is missing because we had technical problems there. To play the recordings (they are .lpd files) you need the Lecturnity Player. You can download the player for free here.
Here are the rules for the exercises as explained in Lecture 2.
Here you can upload your solutions for Exercise Sheet 4.
Questions or comments below this line, most recent on top please
@Mirko: Sure, but an inverted list is a list of words where each word has attached to it the IDs of the documents in which it occurs. For example: if word no. 5 occurs in Doc1, Doc2 and Doc3, and word no. 2 occurs in Doc5, the list would look like: 5 -> Doc1, Doc2, Doc3; 2 -> Doc5. Or am I mistaken? My question then is: how long should these attached lists be on average? One could imagine that we have 1 million documents over only 3 words, so these lists could get very large...
EDIT: Oh OK. Now I see your point. It's not an index, it's a list. Okay. So, what is an inverted list with Doc-IDs, then?
EDIT EDIT: And to your question, Mirko, take a look at http://snippets.dzone.com/posts/show/93. Maybe this helps... Marius 11/14/2009 08:46pm
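To make the terminology in this thread concrete, here is a minimal Python sketch. It is not taken from the exercise sheet, and the function name `build_inverted_index` and the toy collection are made up for illustration. It builds an inverted index for the example above, where each word ID maps to its own inverted list, that is, the sorted list of IDs of the documents containing that word.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word ID to the sorted list of doc IDs that contain it.

    docs: dict from doc ID to the list of word IDs occurring in that document.
    """
    postings = defaultdict(set)  # a set, so a word repeated in one doc counts once
    for doc_id, word_ids in docs.items():
        for word_id in word_ids:
            postings[word_id].add(doc_id)
    return {word_id: sorted(doc_ids) for word_id, doc_ids in postings.items()}

# Hypothetical toy collection matching the example in the comment above:
# word 5 occurs in Doc1, Doc2 and Doc3; word 2 occurs in Doc5.
docs = {1: [5], 2: [5], 3: [5], 5: [2]}
index = build_inverted_index(docs)
print(index[5])  # [1, 2, 3]
print(index[2])  # [5]
```

In this picture the whole dictionary is the index, and each value (for example `index[5]`) is one inverted list. Its length is simply the number of documents containing the word, so frequent words have long lists.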
@Marius: I think we are supposed to generate one inverted list of size m, with doc IDs from 1..n (therefore n >= m, because no duplicates?).
Now a question from my side: for Exercise 4, programming the compression in Java, is there any good tutorial about how to handle the bit-level operations? (Otherwise, I think, it would cost me too much time.) Mirko 14Nov09, 19:18
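As a rough illustration of both points, here is a hedged Python sketch. The exercise sheet is not reproduced on this page, so the compression scheme is an assumption: the sketch uses variable-byte encoding of the gaps between consecutive doc IDs, purely to show the kind of bit handling involved. All function names are hypothetical. The masks and shifts (`& 0x7F`, `| 0x80`, `>>`, `<<`) exist with the same meaning in Java.

```python
import random

def random_inverted_list(m, n, seed=0):
    """Sorted list of m distinct doc IDs drawn from 1..n (requires n >= m)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, n + 1), m))

def vbyte_encode(doc_ids):
    """Encode the gaps between sorted doc IDs, 7 payload bits per byte."""
    out = bytearray()
    previous = 0
    for doc_id in doc_ids:
        gap = doc_id - previous
        previous = doc_id
        while gap >= 128:
            out.append(gap & 0x7F)  # low 7 bits; high bit 0 means "more bytes follow"
            gap >>= 7
        out.append(gap | 0x80)      # final byte of this gap; high bit 1 marks the end
    return bytes(out)

def vbyte_decode(data):
    """Inverse of vbyte_encode: rebuild the sorted doc IDs from the gap bytes."""
    doc_ids = []
    gap, shift, previous = 0, 0, 0
    for byte in data:
        gap |= (byte & 0x7F) << shift
        if byte & 0x80:             # end of the current gap
            previous += gap
            doc_ids.append(previous)
            gap, shift = 0, 0
        else:
            shift += 7
    return doc_ids

ids = random_inverted_list(m=1000, n=100000)
encoded = vbyte_encode(ids)
print(len(encoded), "bytes instead of", 4 * len(ids))
```

Storing gaps rather than absolute IDs keeps the numbers small, which is what makes a byte-oriented code like this pay off. If the exercise asks for a different code, only the two encode/decode functions would change.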
Hi, do you have any suggestions what the best numbers for m and n in exercise 4 should look like? Or are we supposed to mess around a bit with ints and longs? And: How long should the list of documents in the inverted index be? Marius 14Nov09 6:40pm
And just to clarify what a single-cycle permutation is. Here is an example for an array of size 5 with a permutation that is a single cycle: 5 4 1 3 2. Why single cycle? Well, A[1] = 5, A[5] = 2, A[2] = 4, A[4] = 3, A[3] = 1. (My indices in this example are 1,...,5 and not 0,...,4.) Here is an example of a permutation with three cycles: 2 1 4 3 5. The first cycle is A[1] = 2, A[2] = 1. The second cycle is A[3] = 4, A[4] = 3. The third cycle is A[5] = 5. Hannah 12Nov09 8:04pm
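The two examples above can be checked mechanically. The following Python sketch (the function name `cycles` is made up for illustration) decomposes a permutation into its cycles, using the same 1-based indexing as in the comment.

```python
def cycles(perm):
    """Return the cycles of a permutation of 1..n.

    perm is a Python list holding A[1], ..., A[n], so A[i] is perm[i - 1].
    Each cycle is reported as the list of indices visited, starting at its
    smallest index.
    """
    n = len(perm)
    seen = [False] * (n + 1)
    result = []
    for start in range(1, n + 1):
        if seen[start]:
            continue
        cycle = []
        i = start
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = perm[i - 1]  # follow the permutation: i -> A[i]
        result.append(cycle)
    return result

print(cycles([5, 4, 1, 3, 2]))  # [[1, 5, 2, 4, 3]]
print(cycles([2, 1, 4, 3, 5]))  # [[1, 2], [3, 4], [5]]
```

A permutation is a single cycle exactly when this function returns one list of length n.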
Hi Daniel + all, I don't quite understand your question and your example (if your array is 1 5 3 4 2, why is A[1] = 3?). In case you are referring to the requirement of the exercise that the permutation consist of only a single cycle: that is because your code should visit each element exactly once (it should, of course, stop after n iterations, where n is the size of the array). If your permutation has more than one cycle, that is hard to achieve. Also note that for both (1) and (2), the sum of the array values should be 1 + 2 + ... + n = n * (n+1) / 2. Hannah 12Nov09 7:54pm
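Here is a small Python sketch of that check. It rests on assumptions: the exercise sheet is not reproduced on this page, so the function names are invented, and Sattolo's algorithm is used as one known way to generate a single-cycle permutation, which may differ from what the sheet intends. The array is summed once in index order and once by following the permutation for exactly n steps.

```python
import random

def single_cycle_permutation(n, seed=0):
    """Sattolo's algorithm: a random permutation of 1..n with exactly one cycle."""
    rng = random.Random(seed)
    a = list(range(1, n + 1))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i; excluding j == i is what forces one cycle
        a[i], a[j] = a[j], a[i]
    return a

def sum_in_order(a):
    """Sum A[1], A[2], ..., A[n] in index order."""
    return sum(a)

def sum_following_permutation(a):
    """Start at index 1, jump from i to A[i] for exactly n steps, sum the values seen."""
    total, i = 0, 1
    for _ in range(len(a)):
        i = a[i - 1]  # 1-based index i lives at Python position i - 1
        total += i
    return total

n = 1000
a = single_cycle_permutation(n)
print(sum_in_order(a) == sum_following_permutation(a) == n * (n + 1) // 2)

# With more than one cycle the second traversal gets stuck: in 1 5 3 4 2
# we have A[1] = 1, so it keeps revisiting index 1 and never sees the rest.
print(sum_following_permutation([1, 5, 3, 4, 2]))  # 5, not 15
```

With a single cycle both traversals touch every element exactly once, so both sums equal n * (n+1) / 2, and only the memory access pattern differs.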
Hi, I just looked at the new Exercise Sheet 4. In Exercise 1 we should generate a permutation and sum up the resulting array. Am I wrong, or does the second iteration method not go through the whole array in every situation? For example: n = 5, permutation: 1 5 3 4 2, then A[1] = 3, A[A[1]] = A[3] = 1, A[1] = 3 ... Daniel 12Nov09 19:44pm