AD Teaching Wiki: InformationRetrievalWS1213 / ExamSolutions

TASK 1 (Inverted Index)

1.1 Inverted lists with tf-scores, gap-encoded

bla: 1 tf2, 2 tf2, 4 tf2   ->  +1 tf2, +1 tf2, +2 tf2
bli: 2 tf1, 5 tf1          ->  +2 tf1, +3 tf1
blu: 3 tf2, 4 tf1, 5 tf2   ->  +3 tf2, +1 tf1, +1 tf2
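
A minimal C++ sketch of this gap encoding (not part of the original answer); the Posting struct and the function name gapEncode are made up for illustration:

#include <iostream>
#include <vector>

// A posting: document id and term frequency (tf score).
struct Posting { int docId; int tf; };

// Replace each doc id by the gap to its predecessor (the first gap is
// relative to 0), keeping the tf scores unchanged.
std::vector<Posting> gapEncode(const std::vector<Posting>& list) {
  std::vector<Posting> result;
  int previousDocId = 0;
  for (const Posting& p : list) {
    result.push_back({p.docId - previousDocId, p.tf});
    previousDocId = p.docId;
  }
  return result;
}

int main() {
  // Inverted list of "bla" from Task 1.1.
  std::vector<Posting> bla = {{1, 2}, {2, 2}, {4, 2}};
  for (const Posting& p : gapEncode(bla)) {
    std::cout << "+" << p.docId << " tf" << p.tf << " ";  // +1 tf2 +1 tf2 +2 tf2
  }
  std::cout << std::endl;
}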

1.3 With entropy-optimal encoding

+1 -> 0
+2 -> 10
+3 -> 11

tf1 -> 0
tf2 -> 1

bla: 01 01 101
bli: 100 110
blu: 111 00 01
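
A small C++ sketch applying these two code tables (an illustration, not part of the exam answer); the names GAP_CODE, TF_CODE, and encodeList are made up:

#include <iostream>
#include <map>
#include <string>
#include <vector>

// Prefix codes from Task 1.3: codes for the gaps and codes for the tf-scores.
const std::map<int, std::string> GAP_CODE = {{1, "0"}, {2, "10"}, {3, "11"}};
const std::map<int, std::string> TF_CODE  = {{1, "0"}, {2, "1"}};

// Encode a gap-encoded list of (gap, tf) pairs as a bit string.
std::string encodeList(const std::vector<std::pair<int, int>>& gapsAndTfs) {
  std::string bits;
  for (const auto& [gap, tf] : gapsAndTfs) {
    bits += GAP_CODE.at(gap) + TF_CODE.at(tf) + " ";
  }
  return bits;
}

int main() {
  std::cout << encodeList({{1, 2}, {1, 2}, {2, 2}}) << std::endl;  // bla: 01 01 101
  std::cout << encodeList({{2, 1}, {3, 1}}) << std::endl;          // bli: 100 110
  std::cout << encodeList({{3, 2}, {1, 1}, {1, 2}}) << std::endl;  // blu: 111 00 01
}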

TASK 2 (List Intersection)

2.1 Code for intersecting two lists represented as bit vectors:

#include <cassert>
#include <vector>

// Intersect two inverted lists represented as bit vectors of equal length.
std::vector<bool> intersect(const std::vector<bool>& list1, const std::vector<bool>& list2) {
  assert(list1.size() == list2.size());
  std::vector<bool> result;
  for (size_t i = 0; i < list1.size(); i++) { result.push_back(list1[i] && list2[i]); }
  return result;
}

CORRECTION NOTE: It can be assumed that they have the same length; if someone does special magic for different lengths that is also fine, but it's not necessary.
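
Usage sketch (illustrative only, compiled together with the intersect function above): it intersects the bit vectors of bla and blu over the documents 1 to 5 from Task 1.

#include <iostream>

int main() {
  // Bit vectors over documents 1..5: bla occurs in 1, 2, 4 and blu in 3, 4, 5.
  std::vector<bool> bla = {true, true, false, true, false};
  std::vector<bool> blu = {false, false, true, true, true};
  std::vector<bool> result = intersect(bla, blu);
  for (size_t i = 0; i < result.size(); i++) {
    if (result[i]) { std::cout << "document " << (i + 1) << std::endl; }  // prints: document 4
  }
}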

2.2 Comparison with gap-encoded representation.

The entropy of the given gap distribution Pr(gap = i) = 2^-i is H(X) = sum_i Pr(i) * log2(1/Pr(i)) = sum_i i / 2^i = 2. Hence, with an entropy-optimal code, we need 2m bits on average for a list with m entries, while the bit vector always needs n bits. The bit vector is hence better if and only if n < 2m, that is, if and only if m > n/2.
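
A short C++ sketch (not part of the answer) that checks this arithmetic numerically, assuming the gap distribution Pr(gap = i) = 2^-i implied by the sum above; the values n and m are arbitrary examples:

#include <cmath>
#include <iostream>

int main() {
  // Entropy of Pr(gap = i) = 2^-i: H = sum_i 2^-i * log2(2^i) = sum_i i / 2^i.
  double entropy = 0.0;
  for (int i = 1; i <= 50; i++) { entropy += i / std::pow(2.0, i); }
  std::cout << "H(X) ~= " << entropy << std::endl;  // converges to 2

  // Space comparison for a list with m postings out of n documents:
  // gap encoding needs ~2m bits on average, the bit vector needs n bits.
  int n = 1000, m = 600;
  std::cout << "gap encoding: " << 2 * m << " bits, bit vector: " << n << " bits" << std::endl;
}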

TASK 3 (Fuzzy Search)

3.1 2-gram index and lower bound

W1: #bla#
W2: #bli#
W3: #blu#

#b: W1, W2, W3
bl: W1, W2, W3
la: W1
li: W2
lu: W3
a#: W1
i#: W2
u#: W3

ED(x, y) <= 1  =>  the number of 2-grams that x' and y' have in common must be >= comm := max(|x|, |y|) - 1
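
A C++ sketch of building such a 2-gram index over the padded words (illustrative; the function name buildTwoGramIndex is made up):

#include <iostream>
#include <map>
#include <string>
#include <vector>

// Build an inverted index from the 2-grams of the padded words (# at both ends).
std::map<std::string, std::vector<std::string>> buildTwoGramIndex(const std::vector<std::string>& words) {
  std::map<std::string, std::vector<std::string>> index;
  for (const std::string& word : words) {
    std::string padded = "#" + word + "#";
    for (size_t i = 0; i + 1 < padded.size(); i++) {
      index[padded.substr(i, 2)].push_back(word);
    }
  }
  return index;
}

int main() {
  auto index = buildTwoGramIndex({"bla", "bli", "blu"});
  for (const auto& [gram, words] : index) {   // prints the index from Task 3.1
    std::cout << gram << ":";
    for (const std::string& w : words) { std::cout << " " << w; }
    std::cout << std::endl;
  }
}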

3.2 Find all words with ED <= 1 to blaa

x = blaa
x' = #blaa#
#b: W1, W2, W3
bl: W1, W2, W3
la: W1
aa:
a#: W1

2-grams in common with blaa:
W1: 4
W2: 2
W3: 2

Here comm = max(4, 3) - 1 = 3, so only W1 (with 4 common 2-grams) passes the filter and we only need to compute the edit distance to W1, which is 1. Hence bla is the only word with ED <= 1.
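
A self-contained C++ sketch of this filtering step (not part of the exam answer): it counts the common 2-grams, applies the lower bound comm from 3.1, and computes the edit distance only for words that pass the filter. The function names are illustrative.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Number of 2-grams (counted with multiplicity) that the padded versions of x and y share.
int commonTwoGrams(const std::string& x, const std::string& y) {
  std::string px = "#" + x + "#", py = "#" + y + "#";
  std::vector<std::string> gramsY;
  for (size_t i = 0; i + 1 < py.size(); i++) { gramsY.push_back(py.substr(i, 2)); }
  int common = 0;
  for (size_t i = 0; i + 1 < px.size(); i++) {
    auto it = std::find(gramsY.begin(), gramsY.end(), px.substr(i, 2));
    if (it != gramsY.end()) { common++; gramsY.erase(it); }
  }
  return common;
}

// Standard dynamic-programming edit distance.
int editDistance(const std::string& x, const std::string& y) {
  std::vector<std::vector<int>> d(x.size() + 1, std::vector<int>(y.size() + 1));
  for (size_t i = 0; i <= x.size(); i++) { d[i][0] = i; }
  for (size_t j = 0; j <= y.size(); j++) { d[0][j] = j; }
  for (size_t i = 1; i <= x.size(); i++) {
    for (size_t j = 1; j <= y.size(); j++) {
      int sub = d[i - 1][j - 1] + (x[i - 1] == y[j - 1] ? 0 : 1);
      d[i][j] = std::min({d[i - 1][j] + 1, d[i][j - 1] + 1, sub});
    }
  }
  return d[x.size()][y.size()];
}

int main() {
  std::string query = "blaa";
  std::vector<std::string> words = {"bla", "bli", "blu"};
  for (const std::string& word : words) {
    int common = commonTwoGrams(query, word);
    int threshold = std::max(query.size(), word.size()) - 1;  // lower bound from Task 3.1
    std::cout << word << ": " << common << " common 2-grams";
    if (common >= threshold) { std::cout << ", ED = " << editDistance(query, word); }
    std::cout << std::endl;
  }
}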

TASK 4 (Latent Semantic Indexing)

4.1 Rank + decomposition

For example, the first three column vectors are linearly independent, hence the rank is 3.

CORRECTION NOTE: No need to prove the linear independence (but the selected vectors should indeed be linearly independent). Easy proof of linear independence: compute the determinant, e.g. of the matrix formed by the first three column vectors. If it is non-zero, they are linearly independent.

Obtain A' by changing the entry at row 2, column 2 from 0 to 1.

       [1 1 0 1 0.5]     [1 0]     [1 1 0 1 0.5]
A'  =  [1 1 1 2  1 ]  =  [1 1]  *  [0 0 1 1 0.5]
       [0 0 1 1 0.5]     [0 1]
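
A small C++ sketch (illustrative, not part of the answer) that multiplies the two factors and reproduces A', confirming the rank-2 decomposition; the names Matrix and multiply are made up:

#include <iostream>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Multiply an (l x m) matrix with an (m x n) matrix.
Matrix multiply(const Matrix& a, const Matrix& b) {
  Matrix c(a.size(), std::vector<double>(b[0].size(), 0.0));
  for (size_t i = 0; i < a.size(); i++) {
    for (size_t k = 0; k < b.size(); k++) {
      for (size_t j = 0; j < b[0].size(); j++) { c[i][j] += a[i][k] * b[k][j]; }
    }
  }
  return c;
}

int main() {
  Matrix u = {{1, 0}, {1, 1}, {0, 1}};
  Matrix v = {{1, 1, 0, 1, 0.5}, {0, 0, 1, 1, 0.5}};
  for (const auto& row : multiply(u, v)) {   // prints the rows of A'
    for (double x : row) { std::cout << x << " "; }
    std::cout << std::endl;
  }
}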

4.2 Ranking for query Q = (1, 1, 0), |Q| = sqrt(2).

Scores (dot product with each column of A', divided by the column norm), with division by |Q|:  1, 1, 1/2, sqrt(3)/2, sqrt(3)/2.
Scores without division by |Q|:  sqrt(2), sqrt(2), sqrt(1/2), sqrt(3/2), sqrt(3/2).
The ranking is hence D#1, D#2, D#4, D#5, D#3 in both cases (dividing by |Q| does not change the ranking).
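
A C++ sketch of this score computation (not part of the answer), assuming the documents are the columns of A' as above; the helper names dot and norm are made up:

#include <cmath>
#include <iostream>
#include <vector>

// Euclidean norm of a vector.
double norm(const std::vector<double>& v) {
  double sum = 0.0;
  for (double x : v) { sum += x * x; }
  return std::sqrt(sum);
}

// Dot product of two vectors of equal length.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
  double sum = 0.0;
  for (size_t i = 0; i < a.size(); i++) { sum += a[i] * b[i]; }
  return sum;
}

int main() {
  std::vector<double> q = {1, 1, 0};
  // Columns of A' (documents D1..D5).
  std::vector<std::vector<double>> docs = {
      {1, 1, 0}, {1, 1, 0}, {0, 1, 1}, {1, 2, 1}, {0.5, 1, 0.5}};
  for (size_t i = 0; i < docs.size(); i++) {
    double score = dot(q, docs[i]) / norm(docs[i]);            // without division by |Q|
    std::cout << "D" << (i + 1) << ": " << score / norm(q)     // with division by |Q|
              << " (" << score << " without |Q|)" << std::endl;
  }
}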

TASK 5 (k-means)

5.1 Clustering of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 with 1 and 2 as initial centroids.

Step 1:  1 | 2 3 4 5 6 7 8 9 10,  centroids 1  6
Step 2:  1 2 3 | 4 5 6 7 8 9 10,  centroids 2  7
Step 3:  1 2 3 4 | 5 6 7 8 9 10,  centroids 2.5  7.5
Step 4:  1 2 3 4 5 | 6 7 8 9 10,  centroids 3  8
No more change!
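
A C++ sketch of 1-dimensional k-means with two clusters (illustrative only) that reproduces the centroid sequence above; as in Step 4, a point that is equally far from both centroids is assigned to the left cluster (an assumption of this sketch). Other initial centroids, as asked in 5.2, can be tried by changing the two starting values.

#include <cmath>
#include <iostream>
#include <vector>

int main() {
  std::vector<double> points = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
  double left = 1, right = 2;  // initial centroids
  while (true) {
    // Assign each point to the nearest centroid; ties go to the left cluster.
    std::vector<double> leftCluster, rightCluster;
    for (double p : points) {
      if (std::abs(p - left) <= std::abs(p - right)) { leftCluster.push_back(p); }
      else { rightCluster.push_back(p); }
    }
    // Recompute the centroids as the means of the two clusters.
    double newLeft = 0, newRight = 0;
    for (double p : leftCluster) { newLeft += p; }
    for (double p : rightCluster) { newRight += p; }
    newLeft /= leftCluster.size();
    newRight /= rightCluster.size();
    if (newLeft == left && newRight == right) { break; }  // no more change
    left = newLeft;
    right = newRight;
    std::cout << "centroids: " << left << " " << right << std::endl;  // 1 6, 2 7, 2.5 7.5, 3 8
  }
}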

5.2 Same clustering for any choice of initial centroids.

Observation 1: The clustering always consists of two contiguous parts, since each point is assigned to the nearer centroid. Observation 2: Assume the two parts are not yet of equal size (half-half). Then in the next step the left-most element of a larger right part is assigned to the left cluster (or, symmetrically, the right-most element of a larger left part is assigned to the right cluster). Hence the split keeps moving towards the middle until the clustering is 1 2 3 4 5 | 6 7 8 9 10, the same as in 5.1.

TASK 6 (Naive Bayes)

6.1 Computing the priors and classifying xy and xxxy

Pr(x|X) = 2/3, Pr(y|X) = 1/3, Pr(x|Y) = 2/3, Pr(y|Y) = 1/3, Pr(X) = 2/3, Pr(Y) = 1/3
Pr(X|xy) / Pr(Y|xy) = Pr(X) / Pr(Y) = 2 (the conditional probabilities are the same for both classes and cancel in the ratio), and likewise Pr(X|xxxy) / Pr(Y|xxxy) = 2.

Hence both xy and xxxy are classified as X.

6.2 Generalization + drawing the points and the band of the corresponding SVM classifier.

For any document w = (nx, ny), with nx occurrences of x and ny occurrences of y (omitting the normalization factor Pr(w), which cancels in the ratio):
Pr(X|w) = Pr(X) * Pr(x|X)^nx * Pr(y|X)^ny = 2/3 * (2/3)^nx * (1/3)^ny
Pr(Y|w) = Pr(Y) * Pr(x|Y)^nx * Pr(y|Y)^ny = 1/3 * (2/3)^nx * (1/3)^ny
Hence always Pr(X|w) / Pr(Y|w) = 2, so every w is classified as X.
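
A tiny C++ sketch of this computation (illustrative); posteriorRatio is a made-up helper name:

#include <cmath>
#include <iostream>

// Posterior ratio Pr(X|w) / Pr(Y|w) for a document w with nx occurrences of x
// and ny occurrences of y; the normalization Pr(w) cancels in the ratio.
double posteriorRatio(int nx, int ny) {
  double prX = 2.0 / 3 * std::pow(2.0 / 3, nx) * std::pow(1.0 / 3, ny);
  double prY = 1.0 / 3 * std::pow(2.0 / 3, nx) * std::pow(1.0 / 3, ny);
  return prX / prY;
}

int main() {
  std::cout << posteriorRatio(1, 1) << std::endl;  // xy: 2 -> classified as X
  std::cout << posteriorRatio(3, 1) << std::endl;  // xxxy: 2 -> classified as X
}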

2   ---X--------X-------X-------X--- 

1   ---Y--------Y-------------------

    0  1    2   3   4   5   6   7

The SVM band is just the two horizontal lines.
