public class ShingleAnalyzerWrapper extends Analyzer
Modifier and Type | Field and Description |
---|---|
protected Analyzer | defaultAnalyzer |
protected int | maxShingleSize |
protected boolean | outputUnigrams |
Constructor and Description |
---|
ShingleAnalyzerWrapper() Wraps StandardAnalyzer. |
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer) |
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize) |
ShingleAnalyzerWrapper(int nGramSize) |
Modifier and Type | Method and Description |
---|---|
int | getMaxShingleSize() The max shingle (ngram) size. |
boolean | isOutputUnigrams() |
void | setMaxShingleSize(int maxShingleSize) Set the maximum size of output shingles. |
void | setOutputUnigrams(boolean outputUnigrams) Shall the filter pass the original tokens (the "unigrams") to the output stream? |
TokenStream | tokenStream(java.lang.String fieldName, java.io.Reader reader) Creates a TokenStream which tokenizes all the text in the provided Reader. |
Methods inherited from class Analyzer: close, getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setPreviousTokenStream
protected Analyzer defaultAnalyzer
protected int maxShingleSize
protected boolean outputUnigrams
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
public ShingleAnalyzerWrapper()
Wraps StandardAnalyzer.
public ShingleAnalyzerWrapper(int nGramSize)
public int getMaxShingleSize()
public void setMaxShingleSize(int maxShingleSize)
Parameters: maxShingleSize - max shingle size
public boolean isOutputUnigrams()
public void setOutputUnigrams(boolean outputUnigrams)
Parameters: outputUnigrams - Whether or not the filter shall pass the original tokens to the output stream
public TokenStream tokenStream(java.lang.String fieldName, java.io.Reader reader)
tokenStream in class Analyzer
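To make the effect of maxShingleSize and outputUnigrams concrete, here is a minimal plain-Java sketch of shingling over tokens that have already been split. It is an illustration only and does not call Lucene. The class name ShingleSketch, the method shingles, the single-space separator, the minimum shingle size of 2, and the output order are all assumptions made for this example, not behavior documented on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a plain-Java model of shingling, not Lucene's implementation.
// ShingleSketch and shingles(...) are hypothetical names introduced for this example.
final class ShingleSketch {

    /**
     * Builds word shingles (token n-grams) from already-tokenized text.
     * Assumptions: multi-token shingles start at size 2 and are joined with one space.
     *
     * @param tokens         the original tokens (the "unigrams")
     * @param maxShingleSize the largest number of tokens combined into one shingle
     * @param outputUnigrams whether single tokens are also included in the result
     */
    static List<String> shingles(List<String> tokens, int maxShingleSize, boolean outputUnigrams) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            StringBuilder sb = new StringBuilder();
            // Grow the shingle one token at a time, stopping at the size limit or the end of input.
            for (int size = 1; size <= maxShingleSize && start + size <= tokens.size(); size++) {
                if (size > 1) {
                    sb.append(' ');
                }
                sb.append(tokens.get(start + size - 1));
                // Size-1 entries are the unigrams; emit them only when requested.
                if (size > 1 || outputUnigrams) {
                    out.add(sb.toString());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this", "sentence");
        // prints [please, please divide, divide, divide this, this, this sentence, sentence]
        System.out.println(shingles(tokens, 2, true));
        // prints [please divide, please divide this, divide this, divide this sentence, this sentence]
        System.out.println(shingles(tokens, 3, false));
    }
}
```

In this sketch, raising maxShingleSize adds longer token combinations at each position, while setting outputUnigrams to false drops the single-token entries and leaves only the combined shingles.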
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.