== CompleteSearch Completion Server ==
{{{
startCompletionServer [options] .hybrid
}}}
This starts the [[CompletionServer]]. The options below cover most of the commonly needed functionality.

==== Explicit Server Options ====
 * '''''--zero-fork''''' Run the server in the '''foreground''' and write all output to the console, which is convenient for testing. By default the server runs as a background process and writes all output to a log file.
 * '''''--no-double-fork''''' Fork only once; the process runs forever or until the server is killed.
 * '''''--multi-threaded''''' Run in multithreaded mode (default: process one query after the other; still recommended).
 * '''''--auto-restart''''' Automatically restart the server if it crashes (requires double-fork mode, which is the default).
 * '''''--kill''' '' Stop the server running at the specified port.
 * '''''--kill-running-server''''' If a server is already running, kill it before starting the new one.
 * '''''--port''' '' Specify the port on which the server listens (default is 8888).
 * '''''--pid-file''' '' Specify the name of the file containing the process id. A leading ~ is replaced by the home directory, the first %s by the host name, and the second %s by the port (default is ~/.completesearch__).
 * '''''--locale''' '' Set LC_ALL to this string, irrespective of the special "!encoding:..." word in the index.
 * '''''--maps-directory''' '' Specify the directory containing the maps ''utf8.map'' and ''iso8859-1.map'' (default is the execution directory).
 * '''''--index-type''' [INV|HYB]'' Type of index (default: guess from the index file name).
 * '''''-e''''' Name of the file containing the excerpts info (default: .docs.DB).

==== Query Processing Options ====
 * '''''--normalize-words''''' Normalize all non-facet words. This makes it possible to find ''Müller'' even if ''muller'' is requested. It is recommended to also set the option ''--use-suffix-for-exact-query''.
   Note that the intended behaviour is only achieved if the input was parsed with the same option. See also [[CsvParser]].
 * '''''--word-part-separator-backend''' '' The parts of special words like '':facet:year:*'' used to be separated by a colon. In ASCII, however, the colon lies between the digits and the letters, which might cause problems when reading word ranges from the words file. No problems are expected, but it is still recommended to use a character that comes before the digits, like '!' (now the default). The words file must be built with the same delimiter. See also [[CsvParser]].
 * '''''--query-timeout''' '' Specify the maximum time a request may be processed, to prevent critical queries from bringing the server to a standstill (default is 5000 ms).
 * '''''--word-part-separator-frontend''' '' Specify the separator used in the API to request special queries like :facet:year:1993 (default is ':').
 * '''''--use-suffix-for-exact-query''''' Makes it possible to find ''müller'' when normalization is enabled. Without this option it is necessary to search for ''müller:*'' instead of ''müller''.
 * '''''--disable-cdata-tags''''' Recommended if the info field of each document is valid XML and any invalid XML is already escaped using CDATA. Otherwise the whole output is escaped using CDATA.
 * '''''-E''''' On error, the error message is appended to the response and sent to the client.
 * '''''--document-root''' '' Makes it possible to request files, e.g. HTML pages, located under the given path by requesting '':/''. This feature is disabled by default.
 * '''''--exe-command''' '' If specified, using the query parameter ''exe='' executes the command.
 * Cache/history sizes must be greater than 0 and are given in one of the following forms: ''n'' meaning ''n bytes'', ''nK'' meaning ''n kilobytes'', ''nM'' meaning ''n megabytes'', ''nG'' meaning ''n gigabytes''.
 * '''''--max-size-history''' '' Set the history size (default: 32 megabytes).
 * '''''--max-queries-history''' '' Keep at most this many queries in the history (default: 200; note: the current implementation is quadratic).
 * '''''--cache-size-excerpts''' '' Set the cache size for the excerpts generator (default: 16 megabytes).
 * '''''--cleanup-query-before-processing''''' Clean up the query before processing by correcting the order of the characters ^, * and ~ and erasing multiple interpretable characters like #, . and *.
 * '''''--how-to-rank-docs''' '' Specify how to rank documents (0 = by score, 1 = by doc id, 2 = by word id, followed by a = ascending or d = descending; default is 0d).
 * '''''--how-to-rank-words''' '' Specify how to rank words (0 = by score, 1 = by doc count, 2 = by occ count, 3 = by word id, followed by a = ascending or d = descending; default is 0d).
 * '''''--score-aggregations''' '' Specify the score aggregation by a 4-letter string over the alphabet {S,M,B}; see the explanation below.
 * There are currently three types of score aggregation: S = sum, M = max, B = sum with bonus for proximity and exact word match. There are two aggregations for doc scores (same completion, different completion) and two aggregations for word scores (same doc, different doc).

==== Logging Options ====
 * '''''--log-file''' '' Specify the file name for the log messages (default is .log).
 * '''''--show-query-result''''' Log information about the query result.
 * '''''--verbosity''' '' Set the log verbosity, especially for debugging (1 = normal, 2 = high, 3 = highest; default is 1).
 * '''''--no-statistics''''' Don't write time statistics to the log file.

{{{#!wiki comment
 * It's possible to provide different outputs (info fields) for one document by using '''''--info-delimiter''' ''. This can be useful if you want to return different columns (e.g. and ) in different situations.
   The different outputs can be requested with the query parameter ''p='', where pos selects the first of the given outputs (p=0), the second (p=1), etc.
}}}

The following options exist but are not yet explained in depth; they are copied from the source code.
 * '''''--use-generalized-edit-distance-slow''''' Use generalized edit distance to rank the word ids (slow!).
 * '''''--read-custom-scores''''' (-0)
 * To enable synonym search, use '''''enable-synonym-search'''''.
 * To enable fuzzy search, use '''''enable-fuzzy-search'''''. This makes it possible to find e.g. ''algorithm'' even when requesting the misspelled ''algoritm~'' (the tilde is essential).
 * '''''--fuzzy-normalize-words''''' (-W)
 * '''''--use-baseline-fuzzysearch''''' (-B)

For more details, it may help to look at the code that processes these command line options. You can find it in the file [[https://ad-svn.informatik.uni-freiburg.de/completesearch/codebase/server/StartCompletionServer.cpp]].
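
Some of the option values described above have a fixed format: size strings such as ''32M'', the 4-letter score aggregation string, and ranking codes such as ''0d''. The following Python sketch shows one way to check such values before starting the server. It is only an illustration and not part of CompleteSearch: all function names are hypothetical, and it assumes that K, M and G are 1024-based multiples, which this page does not specify.

```python
import re

# Assumption: K, M and G are binary (1024-based) multiples.
# The option description above only says "kilobytes", "megabytes", "gigabytes".
_SIZE_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size given as 'n', 'nK', 'nM' or 'nG' to a number of bytes."""
    match = re.fullmatch(r"([0-9]+)([KMG]?)", text)
    if match is None:
        raise ValueError("not a valid size: %r" % text)
    size = int(match.group(1)) * _SIZE_UNITS[match.group(2)]
    if size <= 0:
        # Cache/history sizes must be greater than 0.
        raise ValueError("size must be greater than 0: %r" % text)
    return size


def is_valid_score_aggregations(text):
    """Return True if text is a 4-letter string over the alphabet {S, M, B}."""
    return re.fullmatch(r"[SMB]{4}", text) is not None


def is_valid_ranking(text, max_criterion):
    """Return True if text is a criterion 0..max_criterion followed by a or d.

    Use max_criterion=2 for --how-to-rank-docs and 3 for --how-to-rank-words.
    """
    return re.fullmatch(r"[0-%d][ad]" % max_criterion, text) is not None
```

For example, `parse_size("32M")` corresponds to the documented default of ''--max-size-history'' under the 1024-based assumption, and `is_valid_ranking("0d", 2)` accepts the documented default of ''--how-to-rank-docs''.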