Welcome to the Wiki page of the course '''Search Engines, WS 2009 / 2010'''. Lecturer: [[http://ad.informatik.uni-freiburg.de/staff/bast|Hannah Bast]]. Tutorials: [[http://ad.informatik.uni-freiburg.de/staff/celikik|Marjan Celikik]]. Course web page: [[http://ad.informatik.uni-freiburg.de/teaching/winter-term-2009-2010/suchmaschinen-vorlesung|click here]]. Here are PDFs of the slides of the lectures so far: [[attachment:SearchEnginesWS0910/lecture-1.pdf|Lecture 1]], [[attachment:SearchEnginesWS0910/lecture-2.pdf|Lecture 2]], [[attachment:SearchEnginesWS0910/lecture-3.pdf|Lecture 3]], [[attachment:SearchEnginesWS0910/lecture-4.pdf|Lecture 4]], [[attachment:SearchEnginesWS0910/lecture-5.pdf|Lecture 5]], [[attachment:SearchEnginesWS0910/lecture-6.pdf|Lecture 6]], [[attachment:SearchEnginesWS0910/lecture-7.pdf|Lecture 7]], [[attachment:SearchEnginesWS0910/lecture-8.pdf|Lecture 8]], [[attachment:SearchEnginesWS0910/lecture-9.pdf|Lecture 9]]. Here are the recordings of the lectures so far (except Lecture 2, where we had problems with the microphone), LPD = Lecturnity recording: [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-1.lpd|Recording Lecture 1 (LPD)]], [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-3.lpd|Recording Lecture 3 (LPD)]], [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-4.lpd|Recording Lecture 4 (LPD)]], [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-5.lpd|Recording Lecture 5 (LPD without audio)]], [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-6.lpd|Recording Lecture 6 (LPD)]], [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-7.avi|Recording Lecture 7 (AVI)]], [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-8.avi|Recording Lecture 8 (AVI)]], [[http://vulcano.informatik.uni-freiburg.de/lecturnity/lecture-9.avi|Recording Lecture 9 (AVI)]]. Here are PDFs of the exercise sheets so far: [[attachment:SearchEnginesWS0910/exercise-1.pdf|Exercise Sheet 1]], [[attachment:SearchEnginesWS0910/exercise-2.pdf|Exercise Sheet 2]], [[attachment:SearchEnginesWS0910/exercise-3.pdf|Exercise Sheet 3]], [[attachment:SearchEnginesWS0910/exercise-4.pdf|Exercise Sheet 4]], [[attachment:SearchEnginesWS0910/exercise-5.pdf|Exercise Sheet 5]], [[attachment:SearchEnginesWS0910/exercise-6.pdf|Exercise Sheet 6]], [[attachment:SearchEnginesWS0910/exercise-7.pdf|Exercise Sheet 7]], [[attachment:SearchEnginesWS0910/exercise-8.pdf|Exercise Sheet 8]], [[attachment:SearchEnginesWS0910/exercise-9.pdf|Exercise Sheet 9]]. Here are your solutions and comments on the previous exercise sheets: [[SearchEnginesWS0910/ExerciseSheet1|Solutions and Comments 1]], [[SearchEnginesWS0910/ExerciseSheet2|Solutions and Comments 2]], [[SearchEnginesWS0910/ExerciseSheet3|Solutions and Comments 3]], [[SearchEnginesWS0910/ExerciseSheet4|Solutions and Comments 4]], [[SearchEnginesWS0910/ExerciseSheet5|Solutions and Comments 5]], [[SearchEnginesWS0910/ExerciseSheet6|Solutions and Comments 6]], [[SearchEnginesWS0910/ExerciseSheet7|Solutions and Comments 7]], [[SearchEnginesWS0910/ExerciseSheet8|Solutions and Comments 8]]. The recordings of all lectures are now available, see above. Lecture 2 is missing because we had technical problems there. To play the Lecturnity recordings (.lpd files) you need the [[http://www.lecturnity.de/de/download/lecturnity-player|Lecturnity Player, which you can download here]]. I put the Camtasia recordings as .avi files, which you can play with any ordinary video player; I would recommend [[http://www.videolan.org/vlc|VLC]]. [[SearchEnginesWS0910/Rules|Here are the rules for the exercises as explained in Lecture 2]]. [[SearchEnginesWS0910/MidTermExam|Here is everything about the mid-term exam]]. == Questions and comments about Exercise Sheet 9 below this line (most recent on top) == Hi Björn + all: indeed, you don't have to do anything particularly fancy for repairing the string. When you have repaired the string up to some point i, and the next character is an invalid start of a UTF-8 sequence, you may just replace it by 0 or something like that. If your UTF-8 sequence started successfully and the first byte indicates that it's a k-byte sequence, and one of the next k-1 bytes is invalid, you can replace the whole k-byte sequence in any way you want. The one thing you are not allowed to do is change a valid sequence, those should stay as they are. I agree that this is not very difficult, but it's also not trivial. Note the point of these exercises is not so much to write a complicated program but to compare a relatively simple (but not too simple) program in three different programming languages. I do not understand though, why you think the programming task is stupid. Maybe there is still some other misunderstanding there. It's a very common application that you have a long UTF-8 string which is invalid at a few places, and you want to feed it to another application which will crash when given invalid UTF-8 sequences. You then want to make the string valid leaving the parts that were already valid intact as much as possible. If this is not clear, please ask again. '''Hannah 10Jan10 14:11''' I have a question concerning exercises 2-4: Is there any restriction saying that something indicates the length of a single UTF-8 sequence inside my large, random sequence? From how I read the description of the exercise, it seems as if it was valid to simply change each and every byte (when necessary) to be a 1-byte sequence. This way I have to change very few bits and end up with n UTF-8 sequences for ascii characters. However, this seems to be quite stupid and really easy. Maybe i missed / misread something. I would be thankful for comments. '''Björn 10.Jan 13:57''' Note that the first lecture after the winter break will be on Thursday, January 7. Apart from a new regular topic, in that lecture I will also summarize your feedback from Exercise 4 of Sheet 8, and give my comments on it + explain various new rules which hopefully suit you better. '''Hannah 19Dec09 3:50am''' We have already corrected your exams. For the summary of the results etc. follow the link above. There it will also say where you find your individual results. '''Hannah 19Dec09 3:58'''