Differences between revisions 1 and 3 (spanning 2 versions)
Revision 1 as of 2007-08-24 20:01:32
Size: 9
Editor: infno1613
Comment:
Revision 3 as of 2007-08-27 09:04:04
Size: 479
Editor: mpiat1403
Comment: Comments about section "vocabulary"
== The raw data ==

"The raw data may be anything: [...] It is the job of the parser to process the raw data and produce the following files". If the raw data really may be '''anything''', then the parser must be able to parse '''anything''', which is impossible in general.


== <collection_name>.vocabulary ==

<word written in ASCII>: What should happen to words that contain non-ASCII characters in the input document?
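
To make the question concrete, here is a minimal sketch of three policies a parser could apply to such words. The function name `to_ascii` and the policy names `fold`, `escape` and `drop` are hypothetical and chosen only for illustration; none of them comes from the format description, and which policy (if any) is intended is exactly what needs to be specified.

```python
import unicodedata


def to_ascii(word, policy="fold"):
    """Map a word to ASCII under one of three hypothetical policies."""
    if word.isascii():
        return word
    if policy == "fold":
        # Decompose accented letters (NFKD), then drop everything non-ASCII.
        # Letters without a decomposition (e.g. the German sharp s) are lost.
        decomposed = unicodedata.normalize("NFKD", word)
        return decomposed.encode("ascii", "ignore").decode("ascii")
    if policy == "escape":
        # Keep the information by writing an ASCII escape sequence instead.
        return word.encode("ascii", "backslashreplace").decode("ascii")
    if policy == "drop":
        # Leave the word out of the vocabulary altogether.
        return ""
    raise ValueError("unknown policy: " + policy)
```

Each policy has a cost: folding can merge distinct words, escaping produces entries that are hard to read, and dropping makes the affected words unsearchable.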


Terms "word" and "non-word": Please give an exact definition.
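
As an example of what an exact definition could look like, the sketch below assumes that a "word" is a maximal run of ASCII letters and digits and that a "non-word" is everything between two words. This is only an assumption for the sake of discussion, not the definition used by the format description; `WORD_RE` and `split_words` are hypothetical names.

```python
import re

# Assumed definition: a word is a maximal run of ASCII letters and digits.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def split_words(text):
    """Split text into alternating ("word", s) and ("non-word", s) tokens."""
    tokens = []
    pos = 0
    for match in WORD_RE.finditer(text):
        if match.start() > pos:
            # Everything between two words counts as one non-word token.
            tokens.append(("non-word", text[pos:match.start()]))
        tokens.append(("word", match.group()))
        pos = match.end()
    if pos < len(text):
        tokens.append(("non-word", text[pos:]))
    return tokens
```

Under this assumption, `split_words("Hello, world!")` yields the words `Hello` and `world` and the non-words `, ` and `!`. A real definition would also have to say how hyphens, apostrophes and underscores are treated.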


CompleteSearch: completesearch/DocumentFormats/Discussion (last edited 2011-07-28 21:59:06 by p57B0BAF6)