public class TeeTokenFilter extends TokenFilter
    SinkTokenizer sink1 = new SinkTokenizer(null);
    SinkTokenizer sink2 = new SinkTokenizer(null);
    TokenStream source1 = new TeeTokenFilter(new TeeTokenFilter(new WhitespaceTokenizer(reader1), sink1), sink2);
    TokenStream source2 = new TeeTokenFilter(new TeeTokenFilter(new WhitespaceTokenizer(reader2), sink1), sink2);
    TokenStream final1 = new LowerCaseFilter(source1);
    TokenStream final2 = source2;
    TokenStream final3 = new EntityDetect(sink1);
    TokenStream final4 = new URLDetect(sink2);
    d.add(new Field("f1", final1));
    d.add(new Field("f2", final2));
    d.add(new Field("f3", final3));
    d.add(new Field("f4", final4));

In this example, sink1 and sink2 both receive the tokens from reader1 and reader2 after whitespace tokenization. Any of these streams can then be wrapped in further analysis, and more "sources" can be inserted if desired. Note that the EntityDetect and URLDetect TokenStreams exist only for this example and do not currently exist in Lucene.

See http://issues.apache.org/jira/browse/LUCENE-1058
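The tee-and-sink idea in the example can be illustrated with a minimal, self-contained sketch. It does not use Lucene: MiniStream, MiniSink, MiniTee and TeeSinkSketch are hypothetical stand-ins invented for illustration, and plain strings take the place of Token objects. It only shows the general pattern of passing each token through while copying it to a sink that can be replayed later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream: yields strings, null at end of stream.
interface MiniStream {
    String next();
}

// Plays the role of a sink: stores tokens and replays them when read.
class MiniSink implements MiniStream {
    private final List<String> saved = new ArrayList<String>();
    private Iterator<String> replay;

    void add(String token) {
        saved.add(token);
    }

    public String next() {
        if (replay == null) {
            replay = saved.iterator(); // start replaying on first read
        }
        return replay.hasNext() ? replay.next() : null;
    }
}

// Plays the role of a tee: passes tokens through and copies each one to the sink.
class MiniTee implements MiniStream {
    private final MiniStream input;
    private final MiniSink sink;

    MiniTee(MiniStream input, MiniSink sink) {
        this.input = input;
        this.sink = sink;
    }

    public String next() {
        String token = input.next();
        if (token != null) {
            sink.add(token);
        }
        return token;
    }
}

class TeeSinkSketch {
    // A whitespace-splitting source, standing in for a whitespace tokenizer.
    static MiniStream words(String text) {
        final Iterator<String> it = Arrays.asList(text.trim().split("\\s+")).iterator();
        return new MiniStream() {
            public String next() {
                return it.hasNext() ? it.next() : null;
            }
        };
    }

    // Reads a stream to the end and collects its tokens.
    static List<String> drain(MiniStream stream) {
        List<String> out = new ArrayList<String>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        MiniSink sink = new MiniSink();
        System.out.println(drain(new MiniTee(words("Hello World"), sink))); // [Hello, World]
        System.out.println(drain(new MiniTee(words("foo bar"), sink)));     // [foo, bar]
        System.out.println(drain(sink)); // [Hello, World, foo, bar]
    }
}
```

In this sketch the sink only holds tokens once the tee'd sources have been read, so the sources are drained before the sink.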
SinkTokenizer
input
Constructor and Description |
---|
TeeTokenFilter(TokenStream input, SinkTokenizer sink) |
Modifier and Type | Method and Description |
---|---|
Token | next(Token reusableToken) Returns the next token in the stream, or null at EOS. |
close, reset
next
public TeeTokenFilter(TokenStream input, SinkTokenizer sink)
public Token next(Token reusableToken) throws IOException
Description copied from class: TokenStream
This implicitly defines a "contract" between consumers (callers of this method) and producers (implementations of this method that are the source for tokens):
A producer must call Token.clear() before setting the fields in it and returning it.
Note that a TokenFilter is considered a consumer.
Overrides: next in class TokenStream
Parameters: reusableToken - a Token that may or may not be used to return; this parameter should never be null (the callee is not required to check for null before using it, but it is a good idea to assert that it is not null).
Throws: IOException
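The reusable-token contract can be illustrated with a small sketch. It does not use Lucene: ReToken, WordProducer and ReuseContractSketch are hypothetical names invented for illustration, with ReToken reduced to a single text field. The producer clears the token before filling it, and the consumer shows why a token that is kept across calls has to be copied first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a reusable token with one mutable field.
class ReToken implements Cloneable {
    String text = "";

    void clear() {
        text = "";
    }

    public ReToken clone() {
        try {
            return (ReToken) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

// Producer side of the contract: clear the reusable token, fill it, return it.
class WordProducer {
    private final String[] words;
    private int pos = 0;

    WordProducer(String... words) {
        this.words = words;
    }

    ReToken next(ReToken reusable) {
        assert reusable != null; // callee may assume non-null, asserting is cheap
        if (pos == words.length) {
            return null; // end of stream
        }
        reusable.clear();
        reusable.text = words[pos++];
        return reusable;
    }
}

class ReuseContractSketch {
    // Keeps every token it sees. Because one instance is reused for every call,
    // kept tokens are only stable if they are cloned before being stored.
    static List<String> collect(WordProducer producer, boolean cloneBeforeStoring) {
        List<ReToken> kept = new ArrayList<ReToken>();
        ReToken reusable = new ReToken();
        for (ReToken t = producer.next(reusable); t != null; t = producer.next(reusable)) {
            kept.add(cloneBeforeStoring ? t.clone() : t);
        }
        List<String> texts = new ArrayList<String>();
        for (ReToken t : kept) {
            texts.add(t.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(collect(new WordProducer("a", "b", "c"), true));  // [a, b, c]
        System.out.println(collect(new WordProducer("a", "b", "c"), false)); // [c, c, c]
    }
}
```

Without the clone, every stored reference points at the same instance, so all entries show the last value written by the producer.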
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.