Line 9: | Line 9: |
||<tablestyle="width: 828px; height: 764px;">'''Name''' ||'''Link to uploaded solution''' ||'''Link to uploaded code''' ||'''Name of collection''' ||'''#Docs in collection''' ||'''Zipf epsilon''' || | ||<tablewidth="828px" tableheight="764px">'''Name''' ||'''Link to uploaded solution''' ||'''Link to uploaded code''' ||'''Name of collection''' ||'''#Docs in collection''' ||'''Zipf epsilon''' || |
Line 14: | Line 14: |
||[[SearchEnginesWS0910/WaldemarWittmannExercises|Waldemar Wittmann]] ||[[attachment:SearchEnginesWS0910/WaldemarWittmannExercises/waldemar_wittmann_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/WaldemarWittmannExercises/waldemar_wittmann_ex1.tar.gz|TARGZ]] ||RFC Documents ||1459 ||0.085 || | ||[[SearchEnginesWS0910/WaldemarWittmannExercises|Waldemar Wittmann]] ||[[attachment:SearchEnginesWS0910/WaldemarWittmannExercises/waldemar_wittmann_ex1_update.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/WaldemarWittmannExercises/waldemar_wittmann_ex1_update2.tar.gz|TARGZ]] ||RFC Documents ||1459 ||0.08396 || |
Line 25: | Line 25: |
||[[SearchEnginesWS0910/TriatmokoExercises|Triatmoko]] ||[[attachment:SearchEnginesWS0910/TriatmokoExercises/Triatmoko_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/TriatmokoExercises/Triatmoko_ex1.rar|RAR]] ||Archives from www.textfiles.com || ca 2000 ||0.016 || | ||[[SearchEnginesWS0910/TriatmokoExercises|Triatmoko]] ||[[attachment:SearchEnginesWS0910/TriatmokoExercises/Triatmoko_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/TriatmokoExercises/Triatmoko_ex1.rar|RAR]] ||Archives from www.textfiles.com ||ca 2000 ||0.016 || |
Line 27: | Line 27: |
||[[SearchEnginesWS0910/JonasKrischExercises|Jonas Krisch]] ||[[attachment:SearchEnginesWS0910/JonasKrisch/jonas_krisch_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/JonasKrischExercises/jonas_krisch_ex1.zip|ZIP]] ||textfiles||~1500 ||0.154 || | ||[[SearchEnginesWS0910/JonasKrischExercises|Jonas Krisch]] ||[[attachment:SearchEnginesWS0910/JonasKrischExcersises/jonas_krisch_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/JonasKrischExercises/jonas_krisch_ex1.zip|ZIP]] ||textfiles ||~1500 ||0.154 || ||[[SearchEnginesWS0910/AndreBorgeatExercises|Andre Borgeat]] ||[[attachment:SearchEnginesWS0910/AndreBorgeatExercises/andre_borgeat_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/AndreBorgeatExercises/andre_borgeat_ex1.zip|ZIP]] ||Reuters-21578 ||~20000 || || ||[[SearchEnginesWS0910/JonasKoenemannExercises|Jonas Koenemann]] ||[[attachment:SearchEnginesWS0910/JonasKoenemannExercises/Jonas_Koenemann_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/JonasKoenemannExercises/Jonas_Koenemann_ex1.zip|ZIP]] ||wegt from different pages ||~1500 || || ||[[SearchEnginesWS0910/PareshParadkarExcercises|Paresh Paradkar]] ||[[attachment:SearchEnginesWS0910/PareshParadkarExcercises/Paresh_Paradkar_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/PareshParadkarExcercises/Paresh_Paradkar_ex1.zip|ZIP]] ||Selective archives from www.ibibo.org ||~1600 ||0.05966 || ||[[SearchEnginesWS0910/AlexanderSchneiderExercises|AlexanderSchneider]] ||[[attachment:SearchEnginesWS0910/AlexanderSchneiderExercises/alexander_schneider_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/AlexanderSchneiderExercises/alexander_schneider_ex1.zip|ZIP]] ||selected archives from http://textfiles.com/ ||~ 1000 ||~ 0.023 || ||[[SearchEnginesWS0910/JensSilvaSantistebanExercises|JensSilvaSantisteban]] ||[[attachment:SearchEnginesWS0910/JensSilvaSantistebanExercises/Jens_SilvaSantisteban_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/JensSilvaSantistebanExercises/Jens_SilvaSantisteban_ex1.zip|ZIP]] ||RFCs and some 
other files from the web ||~ 1400 ||~ 0.084 || ||[[SearchEnginesWS0910/DanielFreyExercises|Daniel Frey]] ||[[attachment:SearchEnginesWS0910/DanielFreyExercises/blatt1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/DanielFreyExercises/source.zip|ZIP]] ||archives from http://textfiles.com/ ||~ 50000 ||n.a. || ||[[SearchEnginesWS0910/JohannBetzExercises|JohannBetz]] ||[[attachment:SearchEnginesWS0910/JohannBetzExercises/johann_betz_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/JohannBetzExercises/johann_betz_ex1.zip|ZIP]] ||All textual RFCs ||5536 ||n.a. || ||[[SearchEnginesWS0910/MatthiasFrorathExercises|Matthias Frorath]] ||[[attachment:SearchEnginesWS0910/MatthiasFrorathExercises/Matthias_Frorath_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/MatthiasFrorathExercises/Matthias_Frorath_ex1.zip|ZIP]] ||Some files from textfiles.com ||~ 1300 || || ||[[SearchEnginesWS0910/IvoChichkovExercises|Ivo Chichkov]] ||[[attachment:SearchEnginesWS0910/IvoChichkovExercises/ivo_chichkov_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/IvoChichkovExercises/ivo_chichkov_ex1.zip|ZIP]] ||text converted HTML files - eNews ||~1500 ||0.154 || ||[[SearchEnginesWS0910/ManuelaOrtliebExercises|Manuela Ortlieb]] ||[[attachment:SearchEnginesWS0910/ManuelaOrtliebExercises/Manuela_Ortlieb_ex1.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/ManuelaOrtliebExercises/Manuela_Ortlieb_ex1.zip|ZIP]] ||text converted different eBooks ||2288 ||0.001 || ||[[SearchEnginesWS0910/JonasSterniskoExercises|Jonas Sternisko]] ||[[attachment:SearchEnginesWS0910/JonasSterniskoExercises/jonas_sternisko_ex01.pdf|PDF]] ||[[attachment:SearchEnginesWS0910/JonasSterniskoExercises/jonas_sternisko_ex01.tar.gz|.tgz]] ||text mined with wget from different sources ||27k+ ||0.223 || ||[[SearchEnginesWS0910/EricLacherExercises|Eric Lacher]] || [[http://data.lacher.name/blatt1.zip|blatt1]] ||[[http://data.lacher.name/InvIndex.zip|InvIndex.zip]] ||RFCs ||about 6000 ||0.912232 || |
Exercise Sheet 1
Instructions:
1. (Only necessary once, before you upload something for the first time.) Assume your name is Donald Duck. (0) If you haven't already done so, create a Wiki account with the name DonaldDuck (click on "Login" at the top left, then click on "you can create one now"). Always be logged in when you change anything on the Wiki. (1) Type the following URL in your browser: http://ad-wiki.informatik.uni-freiburg.de/teaching/SearchEnginesWS0910/DonaldDuckExercises. (2) Click on "create new empty page" and save the empty page. (3) We will then add the following line to your page as soon as possible: #acl DonaldDuck:read,write -All:read. This ensures that only you and the organizers of the course can see your solutions to the exercises, the number of points you got, etc.
2. (Assuming you have already created your page http://ad-wiki.informatik.uni-freiburg.de/teaching/SearchEnginesWS0910/DonaldDuckExercises as described above.) (1) Recall that your name is not Donald Duck. (2) Go to your page DonaldDuckExercises. (3) Upload your solutions there as PDF (no other formats allowed), giving your file the name donald_duck_ex1.pdf. (4) Upload your code separately as a ZIP or gzipped TAR archive, giving your file the name donald_duck_ex1.zip or donald_duck_ex1.tgz. (5) Put the corresponding links in the table below, as well as the other information requested. Follow the pattern of the lines already there.
PLEASE UPLOAD SOLUTIONS (PDF) AND CODE (ZIP OR TGZ) SEPARATELY!
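The naming pattern from the instructions above can be sketched in a few lines of Python. This is only an illustration, not part of the course material: the helper names `split_wiki_name`, `submission_names`, and `table_row` are hypothetical, and the sketch assumes a CamelCase wiki name made of plain ASCII letters (such as DonaldDuck). The table row follows the layout of the rows already on the page.

```python
import re

# Base URL of the course wiki, as given in the instructions.
BASE_URL = "http://ad-wiki.informatik.uni-freiburg.de/teaching/SearchEnginesWS0910"


def split_wiki_name(wiki_name):
    """Split a CamelCase wiki name such as 'DonaldDuck' into ['Donald', 'Duck'].

    Assumption: the name consists of plain ASCII letters only.
    """
    return re.findall(r"[A-Z][a-z]*", wiki_name)


def submission_names(wiki_name, sheet=1, archive_ext="zip"):
    """Return the page URL, file names, and ACL line for one exercise sheet."""
    stem = "_".join(part.lower() for part in split_wiki_name(wiki_name))
    stem = f"{stem}_ex{sheet}"
    return {
        "page_url": f"{BASE_URL}/{wiki_name}Exercises",
        "pdf": f"{stem}.pdf",
        "code": f"{stem}.{archive_ext}",
        "acl": f"#acl {wiki_name}:read,write -All:read",
    }


def table_row(wiki_name, collection, num_docs, zipf_epsilon,
              sheet=1, archive_ext="zip"):
    """Build one wiki table row in the style of the rows already in the table."""
    names = submission_names(wiki_name, sheet, archive_ext)
    page = f"SearchEnginesWS0910/{wiki_name}Exercises"
    display = " ".join(split_wiki_name(wiki_name))
    return (
        f"||[[{page}|{display}]] "
        f"||[[attachment:{page}/{names['pdf']}|PDF]] "
        f"||[[attachment:{page}/{names['code']}|{archive_ext.upper()}]] "
        f"||{collection} ||{num_docs} ||{zipf_epsilon} ||"
    )


if __name__ == "__main__":
    print(submission_names("DonaldDuck"))
    print(table_row("DonaldDuck", "RFC Documents", 1459, 0.08396))
```

Calling `submission_names("DonaldDuck", archive_ext="tgz")` gives the gzipped TAR variant of the code file name instead of the ZIP one.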
||'''Name''' ||'''Link to uploaded solution''' ||'''Link to uploaded code''' ||'''Name of collection''' ||'''#Docs in collection''' ||'''Zipf epsilon''' ||
|| || || ||RFCs and German news websites and www.textfiles.com ||5540 and 5415 and 48799 ||0.1052 and 0.0762 and 0.0762 ||
|| || || ||selected archives from www.textfiles.com ||2865 ||0.052 ||
|| || ||Included in Code zip ||non-selected archives from www.textfiles.com ||4328 ||0.788 ||
|| || || ||RFC Documents and Text Stories ||5549 and 1255 ||0.6364 and 0.5137 ||
|| || || ||RFC Documents ||1459 ||0.08396 ||
|| || || ||RFCs and selected files from www.textfiles.com ||44618 ||0.08243 ||
|| || || ||GNU Man-Pages ||5051 ||0.098 ||
|| || || ||RFC ||~5500 ||0.01299 ||
|| || || ||IRC logs ||~3800 ||0.122 ||
|| || || ||RFCs ||1460 ||0.031 ||
|| || || ||Excerpt from RFCs ||2000 ||0.0164 ||
|| || || ||RFCs 1-2000 ||ca. 2000 ||0.06095 ||
|| || || ||some RFCs ||3100 ||0.017163 ||
|| || || ||RFCs ||5520 ||0.94 ||
|| || || ||some humor/fun files from textfiles.com ||~1000 ||0.1 ||
|| || || ||Archives from www.textfiles.com ||ca 2000 ||0.016 ||
|| || || ||HTML files from fünf-filmfreunde.de (a blog about films) ||~ 5000 ||~ 0.022 ||
|| || || ||textfiles ||~1500 ||0.154 ||
|| || || ||Reuters-21578 ||~20000 || ||
|| || || ||wget from different pages ||~1500 || ||
|| || || ||Selective archives from www.ibibo.org ||~1600 ||0.05966 ||
|| || || ||selected archives from http://textfiles.com/ ||~ 1000 ||~ 0.023 ||
|| || || ||RFCs and some other files from the web ||~ 1400 ||~ 0.084 ||
|| || || ||archives from http://textfiles.com/ ||~ 50000 ||n.a. ||
|| || || ||All textual RFCs ||5536 ||n.a. ||
|| || || ||Some files from textfiles.com ||~ 1300 || ||
|| || || ||text converted HTML files - eNews ||~1500 ||0.154 ||
|| || || ||text converted different eBooks ||2288 ||0.001 ||
|| || || ||text mined with wget from different sources ||27k+ ||0.223 ||
|| || || ||RFCs ||about 6000 ||0.912232 ||