public class TrecDocMaker extends BasicDocMaker
Config properties:
Modifier and Type | Field and Description |
---|---|
protected File | dataDir |
protected ThreadLocal | dateFormat |
protected ArrayList | inputFiles |
protected int | iteration |
protected int | nextFile |
protected BufferedReader | reader |
Fields inherited from class BasicDocMaker:
BODY_FIELD, BYTES_FIELD, config, DATE_FIELD, forever, ID_FIELD, indexVal, NAME_FIELD, storeVal, termVecVal, TITLE_FIELD
Constructor and Description |
---|
TrecDocMaker() |
Modifier and Type | Method and Description |
---|---|
protected void | closeInputs() |
protected DateFormat | getDateFormat(int n) |
protected DocData | getNextDocData() Return the data of the next document. |
int | numUniqueTexts() Return how many real unique texts are available, 0 if not applicable. |
protected void | openNextFile() |
protected Date | parseDate(String dateStr) |
protected StringBuffer | read(String prefix, StringBuffer sb, boolean collectMatchLine, boolean collectAll) |
void | resetInputs() Reset inputs so that the test run behaves, input-wise, as if it had just started. |
void | setConfig(Config config) Set the properties. |
Methods inherited from class BasicDocMaker:
addBytes, addUniqueBytes, collectFiles, getByteCount, getCount, getHtmlParser, makeDocument, makeDocument, numUniqueBytes, printDocStatistics, resetUniqueBytes, setHTMLParser
protected ThreadLocal dateFormat
protected File dataDir
protected ArrayList inputFiles
protected int nextFile
protected int iteration
protected BufferedReader reader
public void setConfig(Config config)
Description copied from interface: DocMaker
Set the properties.
Specified by: setConfig in interface DocMaker
Overrides: setConfig in class BasicDocMaker
protected void openNextFile() throws NoMoreDataException, Exception
Throws: NoMoreDataException, Exception
protected void closeInputs()
protected StringBuffer read(String prefix, StringBuffer sb, boolean collectMatchLine, boolean collectAll) throws Exception
Throws: Exception
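This page gives no description for read, so the following is only a minimal sketch of what the parameter names suggest: scan lines until one starts with prefix, optionally keeping the lines passed over (collectAll) and the matching line itself (collectMatchLine). The class name PrefixReadSketch, the explicit reader argument, and the stop-at-end-of-input behaviour are assumptions made for illustration. This is not the TrecDocMaker source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class PrefixReadSketch {

    /**
     * Reads lines until one starts with prefix, or until the input ends.
     * collectAll keeps every line seen before the match;
     * collectMatchLine keeps the matching line itself.
     * (Assumed semantics, inferred from the parameter names only.)
     */
    public static StringBuffer read(BufferedReader reader, String prefix, StringBuffer sb,
                                    boolean collectMatchLine, boolean collectAll)
            throws IOException {
        StringBuffer out = (sb == null) ? new StringBuffer() : sb;
        String sep = "";
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(prefix)) {
                if (collectMatchLine) {
                    out.append(sep).append(line);
                }
                break;
            }
            if (collectAll) {
                out.append(sep).append(line);
                sep = "\n";
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical TREC-like input, used only to exercise the sketch.
        String input = "<DOC>\n<DOCNO>doc-1</DOCNO>\nbody line 1\nbody line 2\n</DOC>\n";
        BufferedReader reader = new BufferedReader(new StringReader(input));
        read(reader, "<DOC>", null, false, false);                       // skip to the document start
        StringBuffer id = read(reader, "<DOCNO>", null, true, false);    // keep only the matching line
        StringBuffer body = read(reader, "</DOC>", null, false, true);   // keep everything before the end tag
        System.out.println(id);
        System.out.println(body);
    }
}
```

Passing the reader explicitly keeps the sketch self-contained; the real method presumably works on the protected reader field listed above.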
protected DocData getNextDocData() throws NoMoreDataException, Exception
Description copied from class: BasicDocMaker
Return the data of the next document.
Specified by: getNextDocData in class BasicDocMaker
Throws:
NoMoreDataException - if data is exhausted (and 'forever' set to false).
Exception
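The interplay between exhausted data, the forever flag, and resetInputs() can be illustrated with a small stand-alone sketch. Everything here is hypothetical: the class CyclingInputsSketch, its nested NoMoreDataException stand-in, and the use of plain strings in place of input files. The counters borrow the names of the nextFile and iteration fields, but their exact role in TrecDocMaker is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CyclingInputsSketch {

    /** Stand-in for the benchmark's NoMoreDataException (hypothetical here). */
    public static class NoMoreDataException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    private final List<String> inputs;
    private final boolean forever;
    private int nextFile = 0;   // index of the next input to hand out
    private int iteration = 0;  // number of completed passes over the inputs

    public CyclingInputsSketch(List<String> inputs, boolean forever) {
        this.inputs = new ArrayList<String>(inputs);
        this.forever = forever;
    }

    /** Returns the next input, wrapping around only when forever is true. */
    public String next() throws NoMoreDataException {
        if (nextFile >= inputs.size()) {
            if (inputs.isEmpty() || !forever) {
                throw new NoMoreDataException();
            }
            nextFile = 0;
            iteration++;
        }
        return inputs.get(nextFile++);
    }

    /** Makes the next call behave as if the run had just started. */
    public void resetInputs() {
        nextFile = 0;
        iteration = 0;
    }

    public int getIteration() {
        return iteration;
    }

    public static void main(String[] args) throws NoMoreDataException {
        CyclingInputsSketch loop = new CyclingInputsSketch(Arrays.asList("a", "b"), true);
        for (int i = 0; i < 5; i++) {
            System.out.println(loop.next() + " (iteration " + loop.getIteration() + ")");
        }
    }
}
```

With forever set to false, the third call to next() on a two-element list throws instead of wrapping, which mirrors the documented exception condition.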
protected DateFormat getDateFormat(int n)
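The dateFormat field is a ThreadLocal, and getDateFormat takes an index, which suggests one set of DateFormat instances per thread (SimpleDateFormat is not thread-safe) with parseDate trying them in turn. The sketch below illustrates that pattern under those assumptions. The class name ThreadLocalDateSketch and both date patterns are invented for the example and are not taken from TrecDocMaker.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class ThreadLocalDateSketch {

    // Hypothetical patterns, chosen for illustration only.
    private static final String[] PATTERNS = { "yyyy-MM-dd", "dd MMM yyyy" };

    // Each thread lazily gets its own DateFormat array, so no instance is shared.
    private static final ThreadLocal<DateFormat[]> FORMATS = new ThreadLocal<DateFormat[]>() {
        @Override
        protected DateFormat[] initialValue() {
            DateFormat[] formats = new DateFormat[PATTERNS.length];
            for (int i = 0; i < PATTERNS.length; i++) {
                formats[i] = new SimpleDateFormat(PATTERNS[i], Locale.US);
            }
            return formats;
        }
    };

    /** Returns the n-th format belonging to the calling thread. */
    public static DateFormat getDateFormat(int n) {
        return FORMATS.get()[n];
    }

    /** Tries each pattern in order; returns null when none matches. */
    public static Date parseDate(String dateStr) {
        for (int i = 0; i < PATTERNS.length; i++) {
            try {
                return getDateFormat(i).parse(dateStr.trim());
            } catch (ParseException e) {
                // Fall through and try the next pattern.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseDate("2013-01-02"));
        System.out.println(parseDate("02 Jan 2013"));
        System.out.println(parseDate("not a date"));
    }
}
```

Returning null for an unparseable string is a design choice of this sketch; the page does not say how parseDate reports failure.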
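public void resetInputs()
Description copied from interface: DocMaker
Reset inputs so that the test run behaves, input-wise, as if it had just started.
Specified by: resetInputs in interface DocMaker
Overrides: resetInputs in class BasicDocMaker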
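public int numUniqueTexts()
Description copied from interface: DocMaker
Return how many real unique texts are available, 0 if not applicable.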
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.