Differences between revisions 6 and 8 (spanning 2 versions)
Revision 6 as of 2008-03-10 17:11:52
Size: 1042
Editor: mpiat1403
Comment: Token
Revision 8 as of 2008-03-10 17:40:31
Size: 1101
Editor: mpiat1403
Comment: Protected word definition
Line 5 (revision 6):
 * '''Analyzers''': Analyzers are components that preprocess input text at index time and/or at search time. It's important to use the same or similar analyzers that process text in a compatible manner at index and query time. For example, if an indexing analyzer lowercases words, then the query analyzer should do the same to enable finding the indexed words.
Line 5 (revision 8):
 * '''Analyzer''': Analyzers are components that preprocess input text at index time and/or at search time. It's important to use the same or similar analyzers that process text in a compatible manner at index and query time. For example, if an indexing analyzer lowercases words, then the query analyzer should do the same to enable finding the indexed words.
Line 15 (revision 6):
 * '''Protected word''':
Line 15 (revision 8):
 * '''Protected word''': A word that is not modified by any stemming transformation.

This page is a glossary of the most important search engine terms.

A

  • Analyzer: An analyzer is a component that preprocesses input text at index time, at search time, or both. The analyzers used at index time and at query time should be the same, or at least process text in a compatible manner. For example, if the indexing analyzer lowercases words, the query analyzer should do the same; otherwise a query will not find the indexed words.
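
The lowercasing example in the Analyzer entry can be sketched in a few lines of Python. This is a minimal illustration, not CompleteSearch code: `lowercase_analyzer`, `build_index`, and `search` are hypothetical names, and the index is a plain dictionary assumed only for the demonstration.

```python
def lowercase_analyzer(text):
    """Toy analyzer (hypothetical): lowercase the text and split it on whitespace."""
    return text.lower().split()


def build_index(documents, analyzer):
    """Map each analyzed term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for term in analyzer(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, analyzer):
    """Return the ids of documents that contain every analyzed query term."""
    terms = analyzer(query)
    if not terms:
        return set()
    results = [index.get(term, set()) for term in terms]
    return set.intersection(*results)


documents = {1: "Full Text Search", 2: "Free text"}
index = build_index(documents, lowercase_analyzer)

# Same analyzer at query time: "TEXT" is lowercased and matches both documents.
matching = search(index, "TEXT", lowercase_analyzer)

# Incompatible analyzer at query time: "TEXT" stays uppercase and finds nothing.
mismatched = search(index, "TEXT", str.split)
```

With the same analyzer on both sides the query finds documents 1 and 2. With plain `str.split` at query time the uppercase term never matches the lowercased index entries.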

F

  • Full text search:

  • Free text:

P

  • Protected word: A word that is not modified by any stemming transformation.

S

  • Stemming: An algorithm that reduces the various forms of a word, such as "runs", "running", and "ran", to their root ("run"), or that does the inverse: it takes a root word and expands it to all of its various forms.

  • Stop word:
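
The Stemming and Protected word entries can be illustrated with a deliberately naive suffix stripper. This is a sketch under stated assumptions: `toy_stem` and its `SUFFIXES` list are hypothetical and far cruder than a real stemmer. In particular, irregular forms such as "ran" are left unchanged here, whereas the definition above expects them to map to "run".

```python
# Hypothetical suffix list for illustration; real stemmers use far richer rules.
SUFFIXES = ("ning", "ing", "s")


def toy_stem(word, protected_words=frozenset()):
    """Strip the first matching suffix unless the word is protected."""
    if word in protected_words:
        # Protected words bypass the stemming transformation entirely.
        return word
    for suffix in SUFFIXES:
        # Keep at least three characters so that short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


stems = [toy_stem(w) for w in ("runs", "running", "ran")]

# Without protection the toy rules wrongly cut "news" down to "new".
unprotected = toy_stem("news")

# Listing "news" as a protected word leaves it untouched.
protected = toy_stem("news", protected_words={"news"})
```

The "news" case shows why protected words are useful: a rule that is right for "runs" is wrong for a word that merely ends in "s".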

T

  • Token: An analyzer splits up an input text into a series of tokens. A token is a substring of the input text that is indexed or queried for and not split any further.
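
As a rough sketch of the Token entry, the function below splits an input text into tokens. The rule used here (a token is a maximal run of ASCII letters and digits) is an assumption made for the example; actual analyzers may split text differently.

```python
import re


def tokenize(text):
    """Split text into tokens: maximal runs of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)


tokens = tokenize("Full-text search, explained.")
```

Here `tokens` holds the four substrings "Full", "text", "search", and "explained"; punctuation and whitespace only separate tokens and are not indexed themselves.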

CompleteSearch: completesearch/Glossary (last edited 2008-09-29 15:49:39 by mpiat1403)