Package | Description |
---|---|
org.apache.lucene.analysis | API and code to convert text into indexable/searchable tokens. |
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. |
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean. |
org.apache.lucene.analysis.cn | Analyzer for Chinese. |
org.apache.lucene.analysis.compound | A filter that decomposes compound words found in many Germanic languages into their parts. |
org.apache.lucene.analysis.de | Analyzer for German. |
org.apache.lucene.analysis.el | Analyzer for Greek. |
org.apache.lucene.analysis.fr | Analyzer for French. |
org.apache.lucene.analysis.miscellaneous | |
org.apache.lucene.analysis.ngram | |
org.apache.lucene.analysis.nl | Analyzer for Dutch. |
org.apache.lucene.analysis.payloads | Provides various convenience classes for creating payloads on Tokens. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.shingle | |
org.apache.lucene.analysis.sinks | Implementations of SinkTokenizer that might be useful. |
org.apache.lucene.analysis.snowball | TokenFilter and Analyzer implementations that use Snowball stemmers. |
org.apache.lucene.analysis.standard | A fast grammar-based tokenizer constructed with JFlex. |
org.apache.lucene.analysis.th | |
org.apache.lucene.index.memory | High-performance single-document main-memory Apache Lucene fulltext search index. |
org.apache.lucene.search.highlight | Classes that provide "keyword in context" features, typically used to highlight search terms in the text of results pages. |
org.apache.lucene.wikipedia.analysis | |
Modifier and Type | Method and Description |
---|---|
Token | Token.clone(char[] newTermBuffer, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset): Makes a clone, but replaces the term buffer and start/end offsets in the process. |
Token | TokenStream.next(): Deprecated. The returned Token is a "full private copy" (not re-used across calls to next()), but this is slower than calling TokenStream.next(Token). |
Token | TokenStream.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
Token | ISOLatin1AccentFilter.next(Token reusableToken) |
Token | TeeTokenFilter.next(Token reusableToken) |
Token | SinkTokenizer.next(Token reusableToken): Returns the next token out of the list of cached tokens. |
Token | LowerCaseFilter.next(Token reusableToken) |
Token | CachingTokenFilter.next(Token reusableToken) |
Token | LengthFilter.next(Token reusableToken): Returns the next input Token whose term() is the right length. |
Token | CharTokenizer.next(Token reusableToken) |
Token | KeywordTokenizer.next(Token reusableToken) |
Token | StopFilter.next(Token reusableToken): Returns the next input Token whose term() is not a stop word. |
Token | PorterStemFilter.next(Token reusableToken) |
Token | Token.reinit(char[] newTermBuffer, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset): Shorthand for calling clear(), setTermBuffer(char[], int, int), setStartOffset(int), setEndOffset(int), and setType(java.lang.String) with Token.DEFAULT_TYPE. |
Token | Token.reinit(char[] newTermBuffer, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset, String newType): Shorthand for calling clear(), setTermBuffer(char[], int, int), setStartOffset(int), setEndOffset(int), and setType(java.lang.String). |
Token | Token.reinit(String newTerm, int newStartOffset, int newEndOffset): Shorthand for calling clear(), setTermBuffer(String), setStartOffset(int), setEndOffset(int), and setType(java.lang.String) with Token.DEFAULT_TYPE. |
Token | Token.reinit(String newTerm, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset): Shorthand for calling clear(), setTermBuffer(String, int, int), setStartOffset(int), setEndOffset(int), and setType(java.lang.String) with Token.DEFAULT_TYPE. |
Token | Token.reinit(String newTerm, int newTermOffset, int newTermLength, int newStartOffset, int newEndOffset, String newType): Shorthand for calling clear(), setTermBuffer(String, int, int), setStartOffset(int), setEndOffset(int), and setType(java.lang.String). |
Token | Token.reinit(String newTerm, int newStartOffset, int newEndOffset, String newType): Shorthand for calling clear(), setTermBuffer(String), setStartOffset(int), setEndOffset(int), and setType(java.lang.String). |
Modifier and Type | Method and Description |
---|---|
void | SinkTokenizer.add(Token t): Override this method to cache only certain tokens, or new tokens based on the old tokens. |
Token | TokenStream.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
Token | ISOLatin1AccentFilter.next(Token reusableToken) |
Token | TeeTokenFilter.next(Token reusableToken) |
Token | SinkTokenizer.next(Token reusableToken): Returns the next token out of the list of cached tokens. |
Token | LowerCaseFilter.next(Token reusableToken) |
Token | CachingTokenFilter.next(Token reusableToken) |
Token | LengthFilter.next(Token reusableToken): Returns the next input Token whose term() is the right length. |
Token | CharTokenizer.next(Token reusableToken) |
Token | KeywordTokenizer.next(Token reusableToken) |
Token | StopFilter.next(Token reusableToken): Returns the next input Token whose term() is not a stop word. |
Token | PorterStemFilter.next(Token reusableToken) |
void | Token.reinit(Token prototype): Copies the prototype token's fields into this one. |
void | Token.reinit(Token prototype, char[] newTermBuffer, int offset, int length): Copies the prototype token's fields into this one, with a different term. |
void | Token.reinit(Token prototype, String newTerm): Copies the prototype token's fields into this one, with a different term. |
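
The next(Token reusableToken) methods above all follow the same reuse contract: the caller allocates a single Token, passes it on every call, and treats the returned Token (which may or may not be the same instance) as valid only until the next call. A minimal consumption loop, assuming a WhitespaceTokenizer chained through the LowerCaseFilter, StopFilter, and PorterStemFilter listed above (the sample text is illustrative):

```java
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.PorterStemFilter;
import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;

public class ReusableTokenExample {
  public static void main(String[] args) throws IOException {
    // Build a small analysis chain from the classes listed above.
    TokenStream stream = new WhitespaceTokenizer(new StringReader("The Quick Brown Foxes"));
    stream = new LowerCaseFilter(stream);
    stream = new StopFilter(stream, StopAnalyzer.ENGLISH_STOP_WORDS);
    stream = new PorterStemFilter(stream);

    // One Token instance is reused for the whole stream.
    final Token reusableToken = new Token();
    for (Token token = stream.next(reusableToken); token != null; token = stream.next(reusableToken)) {
      // The token's contents are only valid until the next call to next(Token).
      System.out.println(token.term() + " [" + token.startOffset() + "," + token.endOffset() + ")");
    }
    stream.close();
  }
}
```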
Modifier and Type | Method and Description |
---|---|
Token | BrazilianStemFilter.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
Token | CJKTokenizer.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
Modifier and Type | Method and Description |
---|---|
Token | ChineseTokenizer.next(Token reusableToken) |
Token | ChineseFilter.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
protected Token | CompoundWordTokenFilterBase.createToken(int offset, int length, Token prototype) |
Token | CompoundWordTokenFilterBase.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
protected Token | CompoundWordTokenFilterBase.createToken(int offset, int length, Token prototype) |
protected void | CompoundWordTokenFilterBase.decompose(Token token) |
protected void | HyphenationCompoundWordTokenFilter.decomposeInternal(Token token) |
protected void | DictionaryCompoundWordTokenFilter.decomposeInternal(Token token) |
protected abstract void | CompoundWordTokenFilterBase.decomposeInternal(Token token) |
Token | CompoundWordTokenFilterBase.next(Token reusableToken) |
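
DictionaryCompoundWordTokenFilter decomposes each incoming token against a word list and emits the original token followed by any dictionary subwords it finds. A rough sketch, assuming the two-argument (TokenStream, String[]) constructor; the dictionary here is illustrative:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.compound.DictionaryCompoundWordTokenFilter;

public class CompoundExample {
  public static void main(String[] args) throws Exception {
    // Illustrative dictionary of word parts; a real one would be much larger.
    String[] dictionary = { "fuss", "ball", "spieler" };

    TokenStream stream = new WhitespaceTokenizer(new StringReader("fussballspieler"));
    stream = new DictionaryCompoundWordTokenFilter(stream, dictionary);

    final Token reusableToken = new Token();
    for (Token t = stream.next(reusableToken); t != null; t = stream.next(reusableToken)) {
      // Expected output under these assumptions: the compound plus its parts.
      System.out.println(t.term());
    }
  }
}
```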
Modifier and Type | Method and Description |
---|---|
Token | GermanStemFilter.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
Token | GreekLowerCaseFilter.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
Token | FrenchStemFilter.next(Token reusableToken) |
Token | ElisionFilter.next(Token reusableToken): Returns the next input Token with any leading elision stripped from term(). |
Modifier and Type | Method and Description |
---|---|
Token | SingleTokenTokenStream.getToken() |
Token | SingleTokenTokenStream.next(Token reusableToken) |
Token | EmptyTokenStream.next(Token reusableToken) |
Token | PrefixAndSuffixAwareTokenFilter.next(Token reusableToken) |
Token | PrefixAwareTokenFilter.next(Token reusableToken) |
Token | PrefixAndSuffixAwareTokenFilter.updateInputToken(Token inputToken, Token lastPrefixToken) |
Token | PrefixAndSuffixAwareTokenFilter.updateSuffixToken(Token suffixToken, Token lastInputToken) |
Token | PrefixAwareTokenFilter.updateSuffixToken(Token suffixToken, Token lastPrefixToken): The default implementation adds the last prefix token's end offset to the suffix token's start and end offsets. |
Modifier and Type | Method and Description |
---|---|
Token | SingleTokenTokenStream.next(Token reusableToken) |
Token | EmptyTokenStream.next(Token reusableToken) |
Token | PrefixAndSuffixAwareTokenFilter.next(Token reusableToken) |
Token | PrefixAwareTokenFilter.next(Token reusableToken) |
void | SingleTokenTokenStream.setToken(Token token) |
Token | PrefixAndSuffixAwareTokenFilter.updateInputToken(Token inputToken, Token lastPrefixToken) |
Token | PrefixAndSuffixAwareTokenFilter.updateSuffixToken(Token suffixToken, Token lastInputToken) |
Token | PrefixAwareTokenFilter.updateSuffixToken(Token suffixToken, Token lastPrefixToken): The default implementation adds the last prefix token's end offset to the suffix token's start and end offsets. |
Constructor and Description |
---|
SingleTokenTokenStream(Token token) |
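
A common use of this package is wrapping a token stream with fixed boundary tokens. The sketch below builds a SingleTokenTokenStream for a prefix and a suffix (using Token.reinit(String, int, int) from the table above) and combines them with PrefixAndSuffixAwareTokenFilter; a three-argument (prefix, input, suffix) constructor is assumed, and the marker text is illustrative:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.miscellaneous.PrefixAndSuffixAwareTokenFilter;
import org.apache.lucene.analysis.miscellaneous.SingleTokenTokenStream;

public class BoundaryTokenExample {
  public static void main(String[] args) throws Exception {
    // Zero-width marker tokens; names are illustrative.
    Token start = new Token();
    start.reinit("_START_", 0, 0);
    Token end = new Token();
    end.reinit("_END_", 0, 0);

    TokenStream input = new WhitespaceTokenizer(new StringReader("some field value"));
    TokenStream stream = new PrefixAndSuffixAwareTokenFilter(
        new SingleTokenTokenStream(start), input, new SingleTokenTokenStream(end));

    final Token reusableToken = new Token();
    for (Token t = stream.next(reusableToken); t != null; t = stream.next(reusableToken)) {
      System.out.println(t.term());
    }
  }
}
```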
Modifier and Type | Method and Description |
---|---|
Token | EdgeNGramTokenFilter.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
Token | NGramTokenizer.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
Token | EdgeNGramTokenizer.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
Token | NGramTokenFilter.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
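
The n-gram filters split each incoming token into character n-grams, often used for prefix or autocomplete-style matching. A small sketch, assuming the EdgeNGramTokenFilter(TokenStream, Side, minGram, maxGram) constructor:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter;

public class NGramExample {
  public static void main(String[] args) throws Exception {
    TokenStream stream = new WhitespaceTokenizer(new StringReader("lucene"));
    // Emit leading-edge n-grams of length 1..4: "l", "lu", "luc", "luce".
    stream = new EdgeNGramTokenFilter(stream, EdgeNGramTokenFilter.Side.FRONT, 1, 4);

    final Token reusableToken = new Token();
    for (Token t = stream.next(reusableToken); t != null; t = stream.next(reusableToken)) {
      System.out.println(t.term());
    }
  }
}
```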
Modifier and Type | Method and Description |
---|---|
Token | DutchStemFilter.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
Token | NumericPayloadTokenFilter.next(Token reusableToken) |
Token | TypeAsPayloadTokenFilter.next(Token reusableToken) |
Token | TokenOffsetPayloadTokenFilter.next(Token reusableToken) |
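
These filters attach a Payload to each token as it passes through; TypeAsPayloadTokenFilter, for example, stores the token's type() as the payload bytes. A minimal sketch that inspects the payload via Token.getPayload(), which is assumed available here:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.payloads.TypeAsPayloadTokenFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.index.Payload;

public class PayloadExample {
  public static void main(String[] args) throws Exception {
    TokenStream stream = new StandardTokenizer(new StringReader("lucene 2.4 rocks"));
    // Copy each token's type (e.g. <ALPHANUM>, <NUM>) into its payload.
    stream = new TypeAsPayloadTokenFilter(stream);

    final Token reusableToken = new Token();
    for (Token t = stream.next(reusableToken); t != null; t = stream.next(reusableToken)) {
      Payload payload = t.getPayload();
      System.out.println(t.term() + " -> "
          + (payload == null ? "no payload" : new String(payload.getData())));
    }
  }
}
```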
Modifier and Type | Method and Description |
---|---|
Token | RussianLowerCaseFilter.next(Token reusableToken) |
Token | RussianStemFilter.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
Token | ShingleFilter.next(Token reusableToken) |
Token | ShingleMatrixFilter.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
float | ShingleMatrixFilter.calculateShingleWeight(Token shingleToken, List shingle, int currentPermutationStartOffset, List currentPermutationRows, List currentPermuationTokens): Evaluates the new shingle token weight. |
abstract ShingleMatrixFilter.TokenPositioner | ShingleMatrixFilter.TokenSettingsCodec.getTokenPositioner(Token token): Retrieves information on how a Token is to be inserted into a ShingleMatrixFilter.Matrix. |
ShingleMatrixFilter.TokenPositioner | ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec.getTokenPositioner(Token token) |
ShingleMatrixFilter.TokenPositioner | ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec.getTokenPositioner(Token token) |
ShingleMatrixFilter.TokenPositioner | ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec.getTokenPositioner(Token token) |
abstract float | ShingleMatrixFilter.TokenSettingsCodec.getWeight(Token token): Have this method return 1f in order to 'disable' weights. |
float | ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec.getWeight(Token token) |
float | ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec.getWeight(Token token) |
float | ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec.getWeight(Token token): Returns a 32-bit float from the payload, or 1f if the payload is null. |
Token | ShingleFilter.next(Token reusableToken) |
Token | ShingleMatrixFilter.next(Token reusableToken) |
abstract void | ShingleMatrixFilter.TokenSettingsCodec.setTokenPositioner(Token token, ShingleMatrixFilter.TokenPositioner tokenPositioner): Sets information on how a Token is to be inserted into a ShingleMatrixFilter.Matrix. |
void | ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec.setTokenPositioner(Token token, ShingleMatrixFilter.TokenPositioner tokenPositioner) |
void | ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec.setTokenPositioner(Token token, ShingleMatrixFilter.TokenPositioner tokenPositioner) |
void | ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec.setTokenPositioner(Token token, ShingleMatrixFilter.TokenPositioner tokenPositioner): Sets the TokenPositioner as the token's flags int value. |
abstract void | ShingleMatrixFilter.TokenSettingsCodec.setWeight(Token token, float weight): Have this method do nothing in order to 'disable' weights. |
void | ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec.setWeight(Token token, float weight) |
void | ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec.setWeight(Token token, float weight) |
void | ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec.setWeight(Token token, float weight): Stores a 32-bit float in the payload, or sets the payload to null if the weight is 1f. |
void | ShingleMatrixFilter.updateToken(Token token, List shingle, int currentPermutationStartOffset, List currentPermutationRows, List currentPermuationTokens): Final touches on a shingle token before it is passed on to the consumer from ShingleMatrixFilter.next(org.apache.lucene.analysis.Token). |
Constructor and Description |
---|
ShingleMatrixFilter.Matrix.Column(Token token) |
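
ShingleFilter combines adjacent tokens into "shingles" (token n-grams), which can improve phrase-like matching. A short sketch, assuming the ShingleFilter(TokenStream, maxShingleSize) constructor:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.shingle.ShingleFilter;

public class ShingleExample {
  public static void main(String[] args) throws Exception {
    TokenStream stream = new WhitespaceTokenizer(new StringReader("please divide this sentence"));
    // Emit unigrams plus shingles of up to two tokens, e.g. "please divide", "divide this", ...
    stream = new ShingleFilter(stream, 2);

    final Token reusableToken = new Token();
    for (Token t = stream.next(reusableToken); t != null; t = stream.next(reusableToken)) {
      System.out.println(t.term() + " (type=" + t.type() + ")");
    }
  }
}
```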
Modifier and Type | Method and Description |
---|---|
void | DateRecognizerSinkTokenizer.add(Token t) |
void | TokenRangeSinkTokenizer.add(Token t) |
void | TokenTypeSinkTokenizer.add(Token t) |
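
Sink tokenizers are used together with TeeTokenFilter from the core analysis package: the tee passes tokens through unchanged while each sink's add(Token) decides whether to cache a copy for later replay. A minimal sketch with the plain SinkTokenizer; the specialized sinks above work the same way, filtering inside add:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.SinkTokenizer;
import org.apache.lucene.analysis.TeeTokenFilter;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;

public class TeeSinkExample {
  public static void main(String[] args) throws Exception {
    SinkTokenizer sink = new SinkTokenizer();
    TokenStream source = new TeeTokenFilter(
        new WhitespaceTokenizer(new StringReader("tokens flow to both consumers")), sink);

    final Token reusableToken = new Token();
    // Consume the main stream first; each token is also offered to the sink via add(Token).
    for (Token t = source.next(reusableToken); t != null; t = source.next(reusableToken)) {
      System.out.println("main: " + t.term());
    }
    // Now replay the cached tokens from the sink.
    for (Token t = sink.next(reusableToken); t != null; t = sink.next(reusableToken)) {
      System.out.println("sink: " + t.term());
    }
  }
}
```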
Modifier and Type | Method and Description |
---|---|
Token | SnowballFilter.next(Token reusableToken): Returns the next input Token, after being stemmed. |
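
SnowballFilter wraps a Snowball-generated stemmer and stems each token's term in place. A small sketch, assuming the SnowballFilter(TokenStream, String name) constructor, where the name is a Snowball language such as "English":

```java
import java.io.StringReader;

import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceTokenizer;
import org.apache.lucene.analysis.snowball.SnowballFilter;

public class SnowballExample {
  public static void main(String[] args) throws Exception {
    TokenStream stream = new WhitespaceTokenizer(new StringReader("running runs runner"));
    stream = new LowerCaseFilter(stream);
    // "English" selects the English Snowball stemmer.
    stream = new SnowballFilter(stream, "English");

    final Token reusableToken = new Token();
    for (Token t = stream.next(reusableToken); t != null; t = stream.next(reusableToken)) {
      System.out.println(t.term());   // stemmed terms, e.g. "run"
    }
  }
}
```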
Modifier and Type | Method and Description |
---|---|
Token | StandardFilter.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
Token | StandardTokenizer.next(Token reusableToken) |
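
StandardTokenizer is the JFlex-based grammar tokenizer mentioned in the package description, and StandardFilter normalizes its output (for example, removing dots from acronyms and trailing 's from words). A short sketch printing each token with its type:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class StandardExample {
  public static void main(String[] args) throws Exception {
    TokenStream stream = new StandardTokenizer(new StringReader("The U.S.A.'s IBM-2000 visit"));
    stream = new StandardFilter(stream);

    final Token reusableToken = new Token();
    for (Token t = stream.next(reusableToken); t != null; t = stream.next(reusableToken)) {
      System.out.println(t.term() + " (" + t.type() + ")");
    }
  }
}
```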
Modifier and Type | Method and Description |
---|---|
Token | ThaiWordFilter.next(Token reusableToken) |
Modifier and Type | Method and Description |
---|---|
protected Token | SynonymTokenFilter.createToken(String synonym, Token current, Token reusableToken): Creates and returns a token for the given synonym of the current input token; override for custom (stateless or stateful) behavior, if desired. |
Token | SynonymTokenFilter.next(Token reusableToken): Returns the next token in the stream, or null at EOS. |
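
The index.memory package is built around MemoryIndex, the single-document in-memory index named in the package description; SynonymTokenFilter is a helper used during its analysis. A minimal sketch of the typical index-one-document-and-query pattern, assuming MemoryIndex.addField(String, String, Analyzer) and search(Query); the field name and text are illustrative:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.memory.MemoryIndex;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class MemoryIndexExample {
  public static void main(String[] args) {
    MemoryIndex index = new MemoryIndex();
    // Analyze and index a single document entirely in memory.
    index.addField("content", "readings about salmons and other flying fish", new StandardAnalyzer());

    Query query = new TermQuery(new Term("content", "fish"));
    float score = index.search(query);   // a score > 0.0f means the document matches
    System.out.println("score = " + score);
  }
}
```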
Modifier and Type | Method and Description |
---|---|
Token | TokenGroup.getToken(int index) |
Modifier and Type | Method and Description |
---|---|
float | QueryScorer.getTokenScore(Token token) |
float | Scorer.getTokenScore(Token token): Called for each token in the current fragment. |
float | SpanScorer.getTokenScore(Token token) |
boolean | Fragmenter.isNewFragment(Token nextToken): Tests whether this token from the stream should be held in a new TextFragment. |
boolean | SimpleFragmenter.isNewFragment(Token token) |
boolean | NullFragmenter.isNewFragment(Token token) |
boolean | SimpleSpanFragmenter.isNewFragment(Token token) |
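
The Scorer and Fragmenter callbacks above are driven by the Highlighter class: the scorer rates each token against the query, and the fragmenter decides where fragments break. A typical usage sketch, assuming the Highlighter(Formatter, Scorer) constructor and getBestFragment(Analyzer, String, String); the field name and text are illustrative:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;

public class HighlightExample {
  public static void main(String[] args) throws Exception {
    Analyzer analyzer = new StandardAnalyzer();
    Query query = new QueryParser("content", analyzer).parse("lucene");

    Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), new QueryScorer(query));
    highlighter.setTextFragmenter(new SimpleFragmenter(40));   // roughly 40-character fragments

    String text = "Apache Lucene is a full-featured text search engine library written in Java.";
    // Re-analyzes the text and returns the best fragment with matches wrapped by the formatter.
    String fragment = highlighter.getBestFragment(analyzer, "content", text);
    System.out.println(fragment);
  }
}
```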
Modifier and Type | Method and Description |
---|---|
Token | WikipediaTokenizer.next(Token reusableToken) |
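
WikipediaTokenizer parses Wikipedia/MediaWiki markup and tags each token with a type describing the syntax element it came from (internal link, bold text, and so on). A brief sketch, assuming the single-argument WikipediaTokenizer(Reader) constructor:

```java
import java.io.StringReader;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.wikipedia.analysis.WikipediaTokenizer;

public class WikipediaExample {
  public static void main(String[] args) throws Exception {
    String wikiText = "'''Apache Lucene''' is a [[search engine]] library.";
    WikipediaTokenizer tokenizer = new WikipediaTokenizer(new StringReader(wikiText));

    final Token reusableToken = new Token();
    for (Token t = tokenizer.next(reusableToken); t != null; t = tokenizer.next(reusableToken)) {
      // The type marks tokens that came from markup, e.g. bold or internal-link text.
      System.out.println(t.term() + " (" + t.type() + ")");
    }
  }
}
```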
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.