Differences between revisions 5 and 6
Revision 5 as of 2008-03-10 17:03:04
Size: 852
Editor: mpiat1403
Comment: Analyzers
Revision 6 as of 2008-03-10 17:11:52
Size: 1042
Editor: mpiat1403
Comment: Token
Line 22: Line 22:

'''T'''

 * '''Token''': An analyzer splits an input text into a series of tokens. A token is a substring of the input text that is indexed or queried as a single unit and is not split any further.

This page is a glossary of the most important search engine terms.

A

  • Analyzers: Components that preprocess input text at index time, at search time, or both. The analyzers used at index time and at query time should process text in a compatible way. For example, if the indexing analyzer lowercases words, the query analyzer should lowercase them too; otherwise queries containing uppercase letters will not match the indexed words.
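
The lowercasing example in the Analyzers entry can be sketched as follows. This is a minimal illustration, not how any particular engine implements analysis: `analyze`, `build_index`, and `search` are hypothetical names, and the "index" is a plain dictionary from token to document ids.

```python
from collections import defaultdict

def analyze(text):
    """Hypothetical analyzer: lowercase the text and split on whitespace."""
    return text.lower().split()

def build_index(docs):
    """Map each analyzed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index

def search(index, query, analyzer=analyze):
    """Return ids of documents that contain every analyzed query token."""
    tokens = analyzer(query)
    if not tokens:
        return set()
    return set.intersection(*(index.get(t, set()) for t in tokens))

docs = {1: "Full Text Search", 2: "Stemming and stop words"}
index = build_index(docs)

# Same analyzer at index and query time: the query is lowercased and matches.
matched = search(index, "SEARCH")
# Incompatible query analyzer (no lowercasing): the query finds nothing.
missed = search(index, "SEARCH", analyzer=str.split)
```

Here `matched` contains document 1, while `missed` is empty because the index only holds lowercased tokens.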

F

  • Full text search:

  • Free text:

P

  • Protected word:

S

  • Stemming: An algorithm that reduces the inflected forms of a word, such as "runs", "running", and "ran", to a common root ("run"). The term also covers the inverse operation, which expands a root word into all of its forms.

  • Stop word:
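
Both directions described in the Stemming entry can be sketched with a toy stemmer. The suffix rules and the `IRREGULAR` table below are assumptions chosen only to cover the "runs, running, ran" example; real stemmers use far more elaborate rules, and this sketch will mis-stem many English words.

```python
# Hypothetical lookup table for forms that no suffix rule can handle.
IRREGULAR = {"ran": "run"}

def stem(word):
    """Toy stemmer: reduce a word form to its root."""
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    for suffix in ("ing", "s"):
        # Only strip a suffix if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            root = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "runn" -> "run".
            if root[-1] == root[-2]:
                root = root[:-1]
            return root
    return word

def expand(root, vocabulary):
    """Inverse direction: all vocabulary words whose stem is the given root."""
    return sorted(w for w in vocabulary if stem(w) == root)

roots = [stem(w) for w in ("runs", "running", "ran")]
forms = expand("run", ["runs", "running", "ran", "walk"])
```

With these rules, every entry of `roots` is "run", and `forms` lists the three forms of "run" found in the vocabulary.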

T

  • Token: An analyzer splits an input text into a series of tokens. A token is a substring of the input text that is indexed or queried as a single unit and is not split any further.
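
As a rough illustration of the Token entry, the sketch below splits an input text into tokens with a regular expression. Treating each maximal run of word characters as one token is an assumption made for this example; actual analyzers may define token boundaries differently.

```python
import re

def tokenize(text):
    """Hypothetical tokenizer: each maximal run of word characters is a token."""
    return re.findall(r"\w+", text)

tokens = tokenize("Full-text search, explained.")
# Each token is a substring of the input and is not split any further:
# ['Full', 'text', 'search', 'explained']
```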

CompleteSearch: completesearch/Glossary (last edited 2008-09-29 15:49:39 by mpiat1403)