completesearch/SourceCodeOverview

Starting the server (in main file StartCompletionServer.cpp)

Parse command line arguments
Double fork (calling process exits, grandchild runs the actual server, child watches grandchild)
Create CompletionServer object
Call method waitForRequestsAndProcess
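
The double fork can be sketched as follows. This is a minimal illustration under assumptions, not the code from StartCompletionServer.cpp: the helper name startDetached and the runServer callback are hypothetical, and the watching child here only reports how the grandchild ended.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstdio>
#include <functional>

// Hypothetical helper, not taken from the CompleteSearch sources.
// In the calling process: returns the pid of the watching child (-1 on error).
// In the child and the grandchild: never returns, both end via _exit.
pid_t startDetached(const std::function<int()>& runServer)
{
  assert(runServer);                 // a callable must be supplied

  pid_t child = fork();
  if (child < 0) return -1;          // first fork failed
  if (child > 0) return child;       // calling process: free to exit now

  // Child: detach from the controlling terminal, then fork again.
  setsid();
  pid_t grandchild = fork();
  if (grandchild < 0) _exit(1);
  if (grandchild == 0) {
    // Grandchild: runs the actual server.
    _exit(runServer());
  }

  // Child: watch the grandchild and report how it ended.
  int status = 0;
  waitpid(grandchild, &status, 0);
  if (WIFSIGNALED(status)) {
    fprintf(stderr, "server killed by signal %d\n", WTERMSIG(status));
  }
  _exit(WIFEXITED(status) ? WEXITSTATUS(status) : 1);
}
```

In a real main function, the calling process would typically return right after startDetached succeeds, so that the shell gets its prompt back while the server keeps running.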

Main server loop (in class CompletionServer)

Read vocabulary, open index file, create complete object (in constructor)
Create socket (in method CompletionServer::createSocket)
Wait for requests (in method CompletionServer::waitForRequestsAndProcess)
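
A rough sketch of the socket setup and the wait loop, using plain POSIX sockets. The function names createListeningSocket and waitForRequests, the maxRequests limit, and the handler callback are assumptions made for illustration; the actual CompletionServer methods may be organized differently.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for CompletionServer::createSocket.
// Returns a listening TCP socket bound to the given port, or -1 on failure.
// Port 0 lets the operating system pick a free port.
int createListeningSocket(uint16_t port)
{
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) return -1;

  // Allow quick restarts of the server on the same port.
  int yes = 1;
  setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  addr.sin_port = htons(port);
  if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
      listen(fd, SOMAXCONN) < 0) {
    close(fd);
    return -1;
  }
  return fd;
}

// Hypothetical stand-in for the wait loop: accepts up to maxRequests
// connections and passes each client socket to the handler.
// Returns the number of connections that were handled.
template <typename Handler>
int waitForRequests(int listenFd, int maxRequests, Handler handler)
{
  assert(maxRequests >= 0);
  int served = 0;
  while (served < maxRequests) {
    int clientFd = accept(listenFd, nullptr, nullptr);  // blocks until a client connects
    if (clientFd < 0) break;
    handler(clientFd);
    close(clientFd);
    ++served;
  }
  return served;
}
```

The sketch handles each connection inline to stay short; in the server described here, the handler would start a new thread per request instead.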

Processing a request (in class CompletionServer)

Create own thread (in method CompletionServer::processRequest)
Create completer object for this thread (in method CompletionServer::processRequestThreadFunction)
Note: The CompletionServer object has one index and one history, and each newly created Completer object (where Completer is typically HYBCompleter) is passed these two by reference (not by copying!!).
Read query string (in method CompletionServer::processRequestThreadFunction)
Process query (via method Completer::processQuery; see below)
Note: The output is an object of type QueryResult, which contains the raw lists of result postings, as well as the lists of the top-ranked matching words and documents. TODO: right now the history item is created while processing the query. But it would also be fine (better?) to copy the result to the history after the result has been sent; see below.
Get excerpts (via method CompletionServer::getExcerptsAndBuildResultString)
TODO: This method should really be part of class ExcerptsGenerator, not of class CompletionServer
Send result to requesting client
Close connection and finish thread
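
The sharing pattern from the note above (one index and one history, a separate completer per thread) can be sketched like this. All types are simplified, hypothetical stand-ins: the real Completer works on postings and query objects, not on plain strings, and the mutex around the history is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Toy index: maps a query string directly to its list of doc ids.
struct Index {
  std::map<std::string, std::vector<int>> postings;
};

// Toy history: results of earlier queries, guarded by a mutex.
struct History {
  std::mutex mutex;
  std::map<std::string, std::vector<int>> results;
};

// One Completer per thread; index and history are held by reference, not copied.
class Completer {
 public:
  Completer(const Index& index, History& history)
      : _index(index), _history(history) {}

  std::vector<int> processQuery(const std::string& query) {
    {
      std::lock_guard<std::mutex> lock(_history.mutex);
      auto cached = _history.results.find(query);
      if (cached != _history.results.end()) return cached->second;
    }
    std::vector<int> result;
    auto it = _index.postings.find(query);
    if (it != _index.postings.end()) result = it->second;
    std::lock_guard<std::mutex> lock(_history.mutex);
    _history.results[query] = result;
    return result;
  }

 private:
  const Index& _index;
  History& _history;
};

// Starts one thread per query; each thread builds its own Completer.
void processRequests(const Index& index, History& history,
                     const std::vector<std::string>& queries,
                     std::vector<std::vector<int>>& answers)
{
  answers.assign(queries.size(), std::vector<int>());
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < queries.size(); ++i) {
    threads.emplace_back([&index, &history, &queries, &answers, i] {
      Completer completer(index, history);
      answers[i] = completer.processQuery(queries[i]);  // each thread writes its own slot
    });
  }
  for (auto& t : threads) t.join();
}
```

Because the completer only stores references, creating one per request is cheap; the price is that every access to the shared history has to be synchronized.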

Processing a query (in class HYBCompleter / CompleterBase)

Method called from outside is CompleterBase::processQuery
In the simplest case, it takes two parameters: a query (of type Query) and a reference to a result pointer (of type QueryResult*). Additional parameters specify the query parameters. TODO: Simply pass a QueryParameter object as third argument, with an intuitive default initialization.
Call CompleterBase::processQueryRecursively
The only things processQuery itself does are rewriting the separator symbols inside join blocks (of the form [...#...#...]), handling exceptions, and doing some timing. Apart from that, processQueryRecursively takes exactly the same parameters as processQuery. It processes the query recursively from left to right: it splits the query at the last separator, processes the first part (which may be empty, in which case the result is conceptually the set of all doc ids), and intersects that result with the list for the part after the last separator (which can be a single prefix, a join block, or an or block).
There are three main cases: (1) the first part is in the history, but not yet finished => wait a little; (2) the first part is in the history, and finished => use it; (3) the first part is not in the history => try filtering; if that is not possible, call processQueryRecursively on the first part. TODO: If in (1) the history entry is still unfinished after some waiting, the query currently fails.
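
The recursion on the last separator can be illustrated with a toy version. This is only a sketch under assumptions: the index is an in-memory map from words to doc ids, every query word is treated as a prefix, the separator is a single space, and all names (ToyIndex, ToyHistory, docsForPrefix, processRecursively) are hypothetical. Case (1), waiting for an unfinished history entry, and filtering are not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <set>
#include <string>

using DocSet = std::set<int>;
using ToyIndex = std::map<std::string, DocSet>;    // word -> docs containing it
using ToyHistory = std::map<std::string, DocSet>;  // query -> finished result

// Union of the doc lists of all words that start with the given prefix.
DocSet docsForPrefix(const ToyIndex& index, const std::string& prefix)
{
  DocSet docs;
  // In a sorted map, the words sharing a prefix form one contiguous range.
  for (auto it = index.lower_bound(prefix);
       it != index.end() && it->first.compare(0, prefix.size(), prefix) == 0;
       ++it) {
    docs.insert(it->second.begin(), it->second.end());
  }
  return docs;
}

DocSet processRecursively(const ToyIndex& index, ToyHistory& history,
                          const std::string& query)
{
  // Case (2): the query is in the history and finished => use it.
  auto cached = history.find(query);
  if (cached != history.end()) return cached->second;

  // Split at the last separator.
  std::size_t pos = query.rfind(' ');
  std::string lastPart =
      (pos == std::string::npos) ? query : query.substr(pos + 1);
  DocSet result = docsForPrefix(index, lastPart);

  // An empty first part stands for "all doc ids", so nothing to intersect.
  if (pos != std::string::npos) {
    // Case (3): recurse on the first part, then intersect.
    DocSet first = processRecursively(index, history, query.substr(0, pos));
    DocSet both;
    std::set_intersection(first.begin(), first.end(),
                          result.begin(), result.end(),
                          std::inserter(both, both.begin()));
    result.swap(both);
  }
  history[query] = result;
  return result;
}
```

Note that every intermediate result is stored in the toy history, so a later query that extends an earlier one by another word only has to compute the last part.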
Try to filter (instead of recursing)
In the normal case (3) from above, the code first checks whether the result can be filtered from the result of a query that is already in the history (= has been computed before). Three types of filtering are tried: normal filtering (filter the result for "scheduling algorithm*" from the result for "scheduling algo*"), advanced filtering (filter the result for "schedul* venue:*" from the result for "sched* venue:*", using an additional intersection), and advanced filtering II (filter the result for "approx*..algo* venue:*" from the result for "approx venue:*", using an additional intersection).
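
Normal filtering can be sketched as follows. The sketch assumes that a history entry keeps its raw postings as (doc id, word) pairs and that queries are stored without the trailing *; the names Posting, PostingHistory, filterByPrefix and tryNormalFiltering are hypothetical. The two advanced variants, which need an additional intersection, are not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Posting {
  int docId;
  std::string word;  // the completion that matched in this document
};
using PostingList = std::vector<Posting>;
using PostingHistory = std::map<std::string, PostingList>;

// Keep only the postings whose word starts with the (longer) prefix.
PostingList filterByPrefix(const PostingList& postings,
                           const std::string& prefix)
{
  PostingList kept;
  for (const Posting& p : postings) {
    if (p.word.compare(0, prefix.size(), prefix) == 0) kept.push_back(p);
  }
  return kept;
}

// Looks for a history entry with the same first part and a shorter
// version of the last prefix. If one exists, filters its postings into
// result and returns true; otherwise returns false and leaves result alone.
bool tryNormalFiltering(const PostingHistory& history,
                        const std::string& query, PostingList& result)
{
  if (query.empty()) return false;
  std::size_t pos = query.rfind(' ');
  std::size_t lastStart = (pos == std::string::npos) ? 0 : pos + 1;
  const std::string lastPrefix = query.substr(lastStart);

  // Shorten the last prefix one character at a time, longest first,
  // always keeping at least one character of it.
  for (std::size_t len = query.size() - 1; len > lastStart; --len) {
    auto it = history.find(query.substr(0, len));
    if (it == history.end()) continue;
    result = filterByPrefix(it->second, lastPrefix);
    return true;
  }
  return false;
}
```

Trying the longest shortened prefix first means the smallest available posting list gets filtered, which is usually the cheapest choice.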
Call CompleterBase::allMatchesForLastPartAndCandidates; TODO: give a better name
In the normal case, this computes the word range pertaining to the last part of the query, and then calls allMatchesForWordRangeAndCandidates, which is the central function processing prefix completion queries of the respective subclass of CompleterBase, typically HYBCompleter.
There is special treatment of the cases where the last part is a join query ([...#...#...]) or a disjunctive query (...|...|...). These cases are solved recursively. For example, the query xyz q1|q2|q3 is solved by recursively computing the result for xyz q1|q2, then the result for xyz q3, and then computing the union (not intersection) of the two. The join case is handled similarly.
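
The recursive treatment of a disjunctive last part can be sketched like this. The function solveWithOr and the solveSimple callback (which answers queries whose last part contains no |) are hypothetical, and results are reduced to plain sets of doc ids.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <set>
#include <string>

using DocIds = std::set<int>;

// Solves a query whose last part may be an or block such as q1|q2|q3.
// solveSimple is a hypothetical callback for queries without '|' in the last part.
DocIds solveWithOr(const std::string& query,
                   const std::function<DocIds(const std::string&)>& solveSimple)
{
  assert(solveSimple);
  std::size_t lastSpace = query.rfind(' ');
  std::size_t lastStart = (lastSpace == std::string::npos) ? 0 : lastSpace + 1;
  std::size_t bar = query.rfind('|');

  // No '|' in the last part: nothing special to do.
  if (bar == std::string::npos || bar < lastStart) return solveSimple(query);

  // "xyz q1|q2|q3" -> left: "xyz q1|q2" (recursive), right: "xyz q3".
  DocIds left = solveWithOr(query.substr(0, bar), solveSimple);
  DocIds right = solveSimple(query.substr(0, lastStart) + query.substr(bar + 1));

  // Union, not intersection, of the two partial results.
  DocIds both;
  std::set_union(left.begin(), left.end(), right.begin(), right.end(),
                 std::inserter(both, both.begin()));
  return both;
}
```

For the query xyz q1|q2|q3, the callback ends up being invoked for xyz q1, xyz q2 and xyz q3, and the three results are merged pairwise.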
Call HYBCompleter/INVCompleter::allMatchesForWordRangeAndCandidates; TODO: give a better name
This is the central function for processing prefix completion queries. For HYBCompleter, it is divided into two cases. Case 1 is when the result for the last part is in the history (that result is then passed as an argument to the method; otherwise that argument is NULL). Case 2 is when the last part requires reading one or several blocks from disk.
TODO: Looks odd to have the history case dealt with here, while all other history, filtering, etc. cases are dealt with in CompleterBase::processQueryRecursively.

CompleteSearch: completesearch/SourceCodeOverview (last edited 2008-04-04 03:44:01 by infno1613)