public class QueryAutoStopWordAnalyzer extends Analyzer
Modifier and Type | Field and Description |
---|---|
static float | defaultMaxDocFreqPercent |
Constructor and Description |
---|
QueryAutoStopWordAnalyzer(Analyzer delegate): Initializes this analyzer with the Analyzer object that actually produces the tokens |
Modifier and Type | Method and Description |
---|---|
int | addStopWords(IndexReader reader): Automatically adds stop words for all fields with terms exceeding the defaultMaxDocFreqPercent |
int | addStopWords(IndexReader reader, float maxPercentDocs): Automatically adds stop words for all fields with terms exceeding the maxPercentDocs |
int | addStopWords(IndexReader reader, int maxDocFreq): Automatically adds stop words for all fields with terms exceeding the maxDocFreq |
int | addStopWords(IndexReader reader, String fieldName, float maxPercentDocs): Automatically adds stop words for the given field with terms exceeding the maxPercentDocs |
int | addStopWords(IndexReader reader, String fieldName, int maxDocFreq): Automatically adds stop words for the given field with terms exceeding the maxDocFreq |
Term[] | getStopWords(): Provides information on which stop words have been identified for all fields |
String[] | getStopWords(String fieldName): Provides information on which stop words have been identified for a field |
TokenStream | tokenStream(String fieldName, Reader reader): Creates a TokenStream which tokenizes all the text in the provided Reader. |
Methods inherited from class Analyzer: close, getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setPreviousTokenStream
public static final float defaultMaxDocFreqPercent
public QueryAutoStopWordAnalyzer(Analyzer delegate)
delegate
- The choice of analyzer that is used to produce the token stream which needs filtering
public int addStopWords(IndexReader reader) throws IOException
reader
- The IndexReader which will be consulted to identify potential stop words that exceed the required document frequency
IOException
public int addStopWords(IndexReader reader, int maxDocFreq) throws IOException
reader
- The IndexReader which will be consulted to identify potential stop words that exceed the required document frequency
maxDocFreq
- The maximum number of index documents which can contain a term, after which the term is considered to be a stop word
IOException
public int addStopWords(IndexReader reader, float maxPercentDocs) throws IOException
reader
- The IndexReader which will be consulted to identify potential stop words that exceed the required document frequency
maxPercentDocs
- The maximum percentage of index documents, expressed as a fraction between 0.0 and 1.0, which can contain a term, after which the term is considered to be a stop word
IOException
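The relationship between the percentage-based and count-based thresholds can be illustrated with a small standalone sketch. It does not use Lucene: `DocFreqThresholdSketch`, `toMaxDocFreq`, and `exceeds` are hypothetical names, and both the truncating conversion and the range check are assumptions about one reasonable behavior, not a statement of what the library does internally.

```java
// Standalone sketch (no Lucene dependency); all names here are hypothetical.
final class DocFreqThresholdSketch {

    // Assumption: the fraction is scaled by the document count and truncated.
    static int toMaxDocFreq(int numDocs, float maxPercentDocs) {
        if (maxPercentDocs < 0.0f || maxPercentDocs > 1.0f) {
            // The range check is a choice made for this sketch only.
            throw new IllegalArgumentException("maxPercentDocs must be between 0.0 and 1.0");
        }
        return (int) (numDocs * maxPercentDocs);
    }

    // A term counts as a stop word once its document frequency exceeds the limit.
    static boolean exceeds(int docFreq, int maxDocFreq) {
        return docFreq > maxDocFreq;
    }

    public static void main(String[] args) {
        int maxDocFreq = toMaxDocFreq(200, 0.5f);
        System.out.println(maxDocFreq);               // 100
        System.out.println(exceeds(101, maxDocFreq)); // true
        System.out.println(exceeds(100, maxDocFreq)); // false
    }
}
```

Under these assumptions, a term that appears in exactly the threshold number of documents is kept, and only terms above it are treated as stop words.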
public int addStopWords(IndexReader reader, String fieldName, float maxPercentDocs) throws IOException
reader
- The IndexReader which will be consulted to identify potential stop words that exceed the required document frequency
fieldName
- The field for which stop words will be added
maxPercentDocs
- The maximum percentage of index documents, expressed as a fraction between 0.0 and 1.0, which can contain a term, after which the term is considered to be a stop word
IOException
public int addStopWords(IndexReader reader, String fieldName, int maxDocFreq) throws IOException
reader
- The IndexReader which will be consulted to identify potential stop words that exceed the required document frequency
fieldName
- The field for which stop words will be added
maxDocFreq
- The maximum number of index documents which can contain a term, after which the term is considered to be a stop word
IOException
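As a rough illustration of the per-field selection described above, the following standalone sketch picks stop words from a map of term to document frequency. It does not call Lucene: `FieldStopWordSketch`, `selectStopWords`, and the sample frequencies are hypothetical, and the map stands in for whatever the IndexReader would report for one field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Standalone sketch (no Lucene dependency); all names and data are hypothetical.
final class FieldStopWordSketch {

    // Returns every term whose document frequency is greater than maxDocFreq.
    static Set<String> selectStopWords(Map<String, Integer> docFreqByTerm, int maxDocFreq) {
        Set<String> stopWords = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : docFreqByTerm.entrySet()) {
            if (entry.getValue() > maxDocFreq) {
                stopWords.add(entry.getKey());
            }
        }
        return stopWords;
    }

    public static void main(String[] args) {
        // Made-up document frequencies for a single field.
        Map<String, Integer> docFreqByTerm = new LinkedHashMap<>();
        docFreqByTerm.put("the", 950);
        docFreqByTerm.put("of", 800);
        docFreqByTerm.put("lucene", 40);

        Set<String> stopWords = selectStopWords(docFreqByTerm, 500);
        System.out.println(stopWords);        // [of, the]
        System.out.println(stopWords.size()); // 2
    }
}
```

The size of the returned set plays the role of the `int` that the `addStopWords` methods return, on the assumption that the return value reports how many stop words were identified.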
public TokenStream tokenStream(String fieldName, Reader reader)
Specified by: tokenStream in class Analyzer
public String[] getStopWords(String fieldName)
fieldName
- The field for which stop words identified in "addStopWords" method calls will be returned
public Term[] getStopWords()
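The per-field bookkeeping behind `getStopWords(String fieldName)` and the filtering of the delegate's tokens can be pictured with the standalone sketch below. It is not the library's implementation: `PerFieldStopWordsSketch`, `add`, and `filter` are hypothetical names, tokens are modeled as plain strings instead of a TokenStream, and returning an empty array for an unknown field is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Standalone sketch (no Lucene dependency); all names here are hypothetical.
final class PerFieldStopWordsSketch {
    private final Map<String, Set<String>> stopWordsPerField = new HashMap<>();

    // Records a stop word for one field.
    void add(String fieldName, String word) {
        stopWordsPerField.computeIfAbsent(fieldName, k -> new TreeSet<>()).add(word);
    }

    // Assumption: a field without stop words yields an empty array.
    String[] getStopWords(String fieldName) {
        Set<String> words = stopWordsPerField.getOrDefault(fieldName, Collections.emptySet());
        return words.toArray(new String[0]);
    }

    // Drops the tokens that are stop words for the given field and keeps the rest in order.
    List<String> filter(String fieldName, List<String> tokens) {
        Set<String> words = stopWordsPerField.getOrDefault(fieldName, Collections.emptySet());
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!words.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        PerFieldStopWordsSketch sketch = new PerFieldStopWordsSketch();
        sketch.add("body", "the");
        sketch.add("body", "of");
        System.out.println(Arrays.toString(sketch.getStopWords("body")));
        System.out.println(sketch.filter("body", Arrays.asList("the", "history", "of", "rome")));
        System.out.println(sketch.filter("title", Arrays.asList("the", "rome")));
    }
}
```

Keeping the stop words keyed by field means a term that is common in one field, such as a body field, can still be searchable in another where it is rare.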
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.