Questions and comments about Exercise Sheet 6 (newest first):

Yes, you are right Marjan, I will just go through the ratios and invert those < 1. But one thing is for sure: a ratio of plain zero DOES NOT MAKE SENSE. So those with a plain zero, please either put the inverse, or use scientific notation. '''Hannah 1Dec09 10:17am'''

I computed the ratios as shown in the example here on the wiki, that is, dividing the values in the order in which they are mentioned (first by second). For me that meant values > 1 for exercise 1 and values < 1 for exercise 2, but I wanted to compute them in the same way. '''Florian 1Dec09 01:22am'''

But if everybody computes the ratios as they want, the numbers in the table won't make sense! It also does not make sense if somebody computes the ratio for Problem 1 in one way and then for Problem 2 in another (I already don't know how the ratios are computed for the few uploaded solutions!) '''Marjan 30Nov09 00:25'''

Hi Björn + all, it doesn't really matter, but I (and probably most humans) find ratios > 1 more intuitive. Just compare 8 and 0.125, which one is easier to grasp? '''Hannah 30Nov09 11:59pm'''

Does it matter which way round we express the ratios? Depending on how we build the quotient, we get different values (all smaller or all greater than 1). Or is that up to us? It should be possible to compare our results anyway, I assume. '''Björn 30Nov 23:36'''

To Björn: You can assume you have gaps of arbitrary size. '''Marjan 30Nov 14:43'''

To Claudius: The whole collection with all words. '''Marjan 30Nov 14:43'''

Is there a limit on how large gaps may be in exercise 3? I'm not sure for which case the two entropies actually fulfill the equation. Gaps that "make sense" (their sum is not larger than n-1), gaps that are at most n, or arbitrary gaps? '''Björn 30Nov09 14:31'''

In Exercise 2, you ask for the costs of scanning the inv. lists of all words in the "collection". Do you mean the collection of words matching the prefix, or the whole collection with all words in the inv. index? '''Claudius 30Nov09 2:16pm'''

Hi Dragos, just three-letter prefixes are fine. I have no plans yet for future exercises with a "*" in the middle. '''Hannah 29Nov09 11:10pm'''

For exercise 1, should we allow the "*" to be in any place? Or is just a three-letter prefix sufficient? I am asking because it would be good to know whether later exercise sheets might need searches that allow multiple "*" in different positions, so that we can implement that now. '''Dragos 29 Nov 22:55'''

Hi Björn, by ratio I simply mean the quotient, that is, how much bigger the one is than the other. For example, if, for a particular prefix, the total size from (1) is one million, and the size from (2) is ten thousand, then, for that prefix, the ratio between the two is one hundred. '''Hannah 29Nov09 7:48pm'''

Hello, I wonder what's meant by the ratio demanded in exercise 1. If I have n lists with a maximum length of "a" and a total length of "b", isn't the ratio something like "a:b"? At least that is what I thought. But adding a colon does not seem to be sufficient for a part of the exercise. Sorry for the meaningless question but I don't want to miss points because I'm not sure how to understand the word ratio. '''Björn 11-29 19:39'''

To all: about the selection of the ten prefixes. The idea was that you pick a meaningful variety by hand, that is, prefixes that one could imagine somebody really typing. The exact selection doesn't really matter, but do avoid extreme cases like a prefix ''yzq'' with one completion and an inverted list of three doc ids. '''Hannah 29Nov09 6:28pm'''

To Marius + all: yes, I am sorry, "cost" was very imprecise here, I actually simply meant the time your code takes. '''Hannah 29Nov09 6:19pm'''

So you say that by "costs" you mean the running time? Or do you understand something else when you say we have to calculate the costs? '''Marius 11/29/09 4:58pm'''

Notice about Problem 2: You should use precise timers when measuring the running times. If your collection is very small and you round your times, it's easy to get 0 ms when merging or scanning the inverted lists. I recommend using microsecond scale. '''Marjan 29Nov09 16:47'''

To Florian: Yes, you can do anything you want to find those words (as long as you produce the required outputs). '''Marjan 29Nov09 16:44'''

For exercise 1, should we use one of the methods presented in the lecture to find all words in the collection with the prefixes or can we do just anything to get them (though it might not be as efficient)? '''Florian 29Nov09 03:51pm'''

When you scan, please make sure that you do something very simple with the elements, like summing up all doc ids, and then outputting that sum. Otherwise a clever compiler might figure out that it can remove the whole loop, because it is not producing a result that is used anywhere. '''Hannah 28Nov09 11:48pm'''

To Mirko: Yes, scanning means one pass over the elements. '''Marjan 28Nov09 19:19'''

Hi, about exercise 2: does "scanning" mean that one looks at every element exactly once? (=> costs of scanning a list are just the size of the list) '''Mirko 28Nov, 19:12'''
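The measuring advice from this thread (scan in one pass, sum the doc ids so the loop produces a used result, time at microsecond scale, report ratios > 1) can be sketched roughly as follows. This is only an illustration in Python, not the reference solution for the sheet. The names `scan_sum`, `time_scan_us` and `ratio_at_least_one` and the toy list are made up for this sketch. In Python the loop would not be optimized away anyway; the summing matters mostly for compiled languages such as C++ or Java.

```python
import time


def scan_sum(doc_ids):
    """One pass over an inverted list, summing the doc ids so the loop has a used result."""
    total = 0
    for doc_id in doc_ids:
        total += doc_id
    return total


def time_scan_us(doc_ids):
    """Return (sum of doc ids, elapsed time in microseconds) for one scan."""
    start = time.perf_counter_ns()  # nanosecond-resolution monotonic timer
    total = scan_sum(doc_ids)
    elapsed_us = (time.perf_counter_ns() - start) / 1000.0
    return total, elapsed_us


def ratio_at_least_one(a, b):
    """Quotient of two positive values, inverted if needed so the result is >= 1."""
    if a <= 0 or b <= 0:
        raise ValueError("both values must be positive; a ratio of zero makes no sense")
    return a / b if a >= b else b / a


if __name__ == "__main__":
    toy_list = list(range(1, 100001))  # hypothetical inverted list of doc ids
    total, elapsed_us = time_scan_us(toy_list)
    print(f"sum = {total}, scan took {elapsed_us:.1f} microseconds")
    # The example from the thread: one million versus ten thousand.
    print(ratio_at_least_one(10_000, 1_000_000))
```

For very short lists a single measurement is noisy, so repeating the scan several times and averaging is a reasonable refinement.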
Welcome to the Wiki page of the course Search Engines, WS 2009 / 2010. Lecturer: Hannah Bast. Tutorials: Marjan Celikik. Course web page: click here.
Here are PDFs of the slides of the lectures so far: Lecture 1, Lecture 2, Lecture 3, Lecture 4, Lecture 5, Lecture 6, Lecture 7, Lecture 8.
Here are the recordings of the lectures so far, as .lpd files unless marked as AVI (except Lecture 2, where we had problems with the microphone): Recording Lecture 1, Recording Lecture 3, Recording Lecture 4, Recording Lecture 5 (no audio), Recording Lecture 6 (with audio for a change), Recording Lecture 7 (AVI), Recording Lecture 8 (AVI).
Here are PDFs of the exercise sheets so far: Exercise Sheet 1, Exercise Sheet 2, Exercise Sheet 3, Exercise Sheet 4, Exercise Sheet 5, Exercise Sheet 6, Exercise Sheet 7, Exercise Sheet 8.
Here are your solutions and comments on the previous exercise sheets: Solutions and Comments 1, Solutions and Comments 2, Solutions and Comments 3, Solutions and Comments 4, Solutions and Comments 5, Solutions and Comments 6, Solutions and Comments 7, Solutions and Comments 8.
The recordings of all lectures are now available, see above. Lecture 2 is missing because we had technical problems there. To play the Lecturnity recordings (.lpd files) you need the Lecturnity Player, which you can download for free at http://www.lecturnity.de/de/download/lecturnity-player. I uploaded the Camtasia recordings as .avi files, which you can play with any ordinary video player; I would recommend VLC (http://www.videolan.org/vlc).
Here are the rules for the exercises as explained in Lecture 2.
ABOUT THE MID-TERM (TRIAL) EXAM
Ok, guys, thanks for your votes. The winner is Friday, December 18, 16 - 18 h. The exam will start at 4.00 pm, so please be there on time. I still have to find a room, but that shouldn't be a problem at that time of the day and week. I will inform you once I know which room it is.
Here are the rules of the exam:
1. It's an open book exam, that is, you can bring and use any number of books, papers, etc. In particular, printouts of the lecture slides, exercise sheets, your solutions, etc. Also any amount of private annotations and the whole CS library if you want. You won't need much for the exam though. I think what will be most useful are the slides, so that you can look up the basic definitions in case you forgot them, and your solutions to the exercise sheets.
2. You are not allowed to use any computing devices, mobile phones, etc. In particular, you are not allowed to communicate with others or in any way connect to the Internet or something like that. You won't need a pocket calculator. In case you need to compute log_2(10/7), we will tell you what it is. Things like 2 * 0.5 or log_2(10/5) you should be able to compute by yourself.
3. Expect one or two tasks where you have to write code for some small functions. For example, a binary search in a list of strings. (No, that will not be a task of this exam.) You can use any of the standard languages: Java, C++, C#. Python and PHP are also ok if you absolutely must. Or you can also use pseudo-code. Anyway, you will be asked to write only relatively simple functions, which do not require any involved language-specific things. You should know basic data structures like arrays, lists, and hash tables though.
4. The material covered is simply everything that we did in the lectures and in the exercises, and nothing beyond that. Note that if you haven't really understood a topic, and a task about that topic comes up, you won't have enough time to go to the slides, understand it, and then solve the task. That is, you should have a basic understanding of everything we did, before the exam. If you did all the exercises, and did them well, chances are high that you have that understanding.
5. In the trial exam there will be 5 tasks of which you have to solve 4. Given that the total time for the exam is 2 hours, this means that you will have 30 minutes for each task on average. We will not be super-strict with the time, that is, if it's 6 pm and you need a little longer, that is fine.
6. This is a pure trial exam, which will not influence your final mark in any way. We will correct the exam like the real thing though and give you real marks, so that you get an impression of where you stand.
Rules 1 and 2 will most probably also hold true for the final exam. Rule 6 obviously not.
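To give a feeling for the size of function meant in the rule about writing code, here is a rough sketch of the binary search example mentioned there (which, as said, will not be a task of this exam). It is written in Python and assumes the input list is already sorted in ascending order. The function name and the word list are made up for illustration.

```python
def binary_search(sorted_words, target):
    """Return the index of target in sorted_words, or -1 if it is not present.

    Assumes sorted_words is sorted in ascending (lexicographic) order.
    """
    lo, hi = 0, len(sorted_words) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_words[mid] == target:
            return mid
        if sorted_words[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1


if __name__ == "__main__":
    words = ["bast", "index", "prefix", "query", "search"]  # hypothetical sorted vocabulary
    print(binary_search(words, "query"))
    print(binary_search(words, "engine"))
```

Each iteration halves the remaining range, so the number of string comparisons grows only logarithmically with the length of the list.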