| Class | Description |
|---|---|
| Analyzer | An Analyzer builds TokenStreams, which analyze text. |
| CharTokenizer | An abstract base class for simple, character-oriented tokenizers. |
| ISOLatin1AccentFilter | A filter that replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) with their unaccented equivalents. |
| KeywordAnalyzer | "Tokenizes" the entire stream as a single token. |
| KeywordTokenizer | Emits the entire input as a single token. |
| LengthFilter | Removes words that are too long or too short from the stream. |
| LetterTokenizer | A LetterTokenizer is a tokenizer that divides text at non-letters. |
| LowerCaseFilter | Normalizes token text to lower case. |
| LowerCaseTokenizer | LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. |
| PerFieldAnalyzerWrapper | This analyzer is used when different fields require different analysis techniques. |
| PorterStemFilter | Transforms the token stream as per the Porter stemming algorithm. |
| SimpleAnalyzer | An Analyzer that filters LetterTokenizer with LowerCaseFilter. |
| StopAnalyzer | Filters LetterTokenizer with LowerCaseFilter and StopFilter. |
| StopFilter | Removes stop words from a token stream. |
| Token | A Token is an occurrence of a term from the text of a field. |
| TokenFilter | A TokenFilter is a TokenStream whose input is another token stream. |
| Tokenizer | A Tokenizer is a TokenStream whose input is a Reader. |
| TokenStream | A TokenStream enumerates the sequence of tokens, either from fields of a document or from query text. |
| WhitespaceAnalyzer | An Analyzer that uses WhitespaceTokenizer. |
| WhitespaceTokenizer | A WhitespaceTokenizer is a tokenizer that divides text at whitespace. |
| WordlistLoader | Loader for text files that represent a list of stopwords. |
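
The descriptions above imply a chain: a tokenizer reads raw text, and each filter wraps another token stream. The following is a minimal stand-alone sketch of that chaining idea, modeled on the StopAnalyzer description (letter tokenizing, then lower-casing, then stop-word removal). It does not use the library: `AnalysisSketch`, `SimpleTokenStream`, `LetterSplitter`, `LowerCaser`, `StopWordRemover`, and `analyze` are hypothetical stand-ins invented for illustration, and the real classes have different signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Stand-alone sketch of a tokenizer/filter chain. All types here are
// hypothetical stand-ins, NOT the classes listed in the table above.
final class AnalysisSketch {

    /** Stand-in for a token stream: yields tokens until it returns null. */
    interface SimpleTokenStream {
        String next();
    }

    /** Stand-in for a letter tokenizer: divides text at non-letters. */
    static final class LetterSplitter implements SimpleTokenStream {
        private final String text;
        private int pos = 0;

        LetterSplitter(String text) {
            this.text = text;
        }

        @Override
        public String next() {
            // Skip any run of non-letter characters.
            while (pos < text.length() && !Character.isLetter(text.charAt(pos))) {
                pos++;
            }
            if (pos >= text.length()) {
                return null; // input exhausted
            }
            int start = pos;
            // Consume the run of letters that forms the next token.
            while (pos < text.length() && Character.isLetter(text.charAt(pos))) {
                pos++;
            }
            return text.substring(start, pos);
        }
    }

    /** Stand-in for a lower-casing filter: wraps another stream. */
    static final class LowerCaser implements SimpleTokenStream {
        private final SimpleTokenStream input;

        LowerCaser(SimpleTokenStream input) {
            this.input = input;
        }

        @Override
        public String next() {
            String token = input.next();
            return token == null ? null : token.toLowerCase(Locale.ROOT);
        }
    }

    /** Stand-in for a stop-word filter: drops tokens found in the stop set. */
    static final class StopWordRemover implements SimpleTokenStream {
        private final SimpleTokenStream input;
        private final Set<String> stopWords;

        StopWordRemover(SimpleTokenStream input, Set<String> stopWords) {
            this.input = input;
            this.stopWords = stopWords;
        }

        @Override
        public String next() {
            String token;
            while ((token = input.next()) != null) {
                if (!stopWords.contains(token)) {
                    return token;
                }
            }
            return null;
        }
    }

    /** Builds the chain and collects every token it produces. */
    static List<String> analyze(String text, Set<String> stopWords) {
        SimpleTokenStream stream =
                new StopWordRemover(new LowerCaser(new LetterSplitter(text)), stopWords);
        List<String> tokens = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [quick, brown, fox, dog]
        System.out.println(analyze("The Quick-Brown fox, and THE dog", Set.of("the", "and")));
    }
}
```

Because each filter only depends on the stream interface, the order of the wrappers determines the result: in this sketch the stop set must be lower case, since stop-word removal runs after lower-casing.
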
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.