com.theloutons.search.specialreaders
Interface DocumentAnalyze

All Known Implementing Classes:
CSVReader, DOCReader, HTMLReader, PDFReader, PPTReader, RTFReader, TXTReader, XLSReader, XMLReader

public interface DocumentAnalyze

Author:
Tom Louton. This interface defines the way in which new file formats can be added.

Method Summary
 org.apache.lucene.document.Document getDocument()
          Returns the document that was created.
 void setFile(java.io.File f, java.io.PrintWriter log)
          Sets the file and performs the extraction.
 

Method Detail

getDocument

public org.apache.lucene.document.Document getDocument()
Returns the document that was created.

Returns:
the Lucene document containing the text extracted from the file f (see setFile below).
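
As a caller-side illustration, the sketch below sets a file, fetches the document, and guards against a null result. It assumes a reader may leave the document null when extraction fails, which the interface does not state outright. Document here is a hypothetical stand-in for org.apache.lucene.document.Document so that the sketch compiles without Lucene, the interface is redeclared locally against that stand-in, and extract and the toy reader are illustrative names only.

```java
import java.io.File;
import java.io.PrintWriter;
import java.io.StringWriter;

class GetDocumentSketch {

    // Hypothetical stand-in for org.apache.lucene.document.Document,
    // used only so this sketch compiles without Lucene on the classpath.
    static class Document {
        final String text;
        Document(String text) { this.text = text; }
    }

    // Local copy of the interface, retyped against the stand-in above.
    interface DocumentAnalyze {
        Document getDocument();
        void setFile(File f, PrintWriter log);
    }

    // Hypothetical caller-side helper: run the reader on one file and
    // return its document, or null if the reader produced none.
    static Document extract(DocumentAnalyze reader, File f, PrintWriter log) {
        reader.setFile(f, log);               // set the file first
        Document doc = reader.getDocument();  // then fetch the result
        if (doc == null) {
            // Assumption: null means the reader could not extract anything.
            log.println("no document for " + f.getName());
        }
        return doc;
    }

    public static void main(String[] args) {
        // Toy reader that "extracts" the file name; it never touches the disk.
        DocumentAnalyze toy = new DocumentAnalyze() {
            private Document doc;
            public Document getDocument() { return doc; }
            public void setFile(File f, PrintWriter log) {
                doc = f.getName().isEmpty() ? null : new Document(f.getName());
            }
        };
        StringWriter buffer = new StringWriter();
        PrintWriter log = new PrintWriter(buffer, true);
        Document d = extract(toy, new File("notes.txt"), log);
        System.out.println(d.text);
    }
}
```

With the real interface, the same guard would apply to the Lucene document before it is handed to an index writer.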

setFile

public void setFile(java.io.File f,
                    java.io.PrintWriter log)
Sets the file and performs the extraction. An implementation could instead defer the extraction to getDocument.

Parameters:
f - the file from which the tokens are to be extracted.
log - a writer for the log file. Wherever an implementation sets doc=null and returns, it is suggested that it first write a reason to the log.
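
A minimal sketch of how a new format reader might follow this contract, under stated assumptions: Document is again a hypothetical stand-in for org.apache.lucene.document.Document (so the code compiles without Lucene), the interface is redeclared locally against it, PlainTextSketchReader is an invented name rather than one of the implementing classes listed above, and the file is assumed to be UTF-8 text. The point illustrated is the logging suggestion: every path that leaves doc null writes a reason to the log first.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class SetFileSketch {

    // Hypothetical stand-in for org.apache.lucene.document.Document.
    static class Document {
        final String contents;
        Document(String contents) { this.contents = contents; }
    }

    // Local copy of the interface, retyped against the stand-in above.
    interface DocumentAnalyze {
        Document getDocument();
        void setFile(File f, PrintWriter log);
    }

    // Hypothetical reader for plain-text files; not part of the package.
    static class PlainTextSketchReader implements DocumentAnalyze {
        private Document doc;

        @Override
        public void setFile(File f, PrintWriter log) {
            doc = null; // drop any result left over from a previous file
            if (f == null || !f.isFile()) {
                log.println("skipped " + f + ": not a readable file");
                return; // doc stays null, and the log says why
            }
            try {
                // Assumption: the file holds UTF-8 text.
                byte[] bytes = Files.readAllBytes(f.toPath());
                doc = new Document(new String(bytes, StandardCharsets.UTF_8));
            } catch (IOException e) {
                log.println("skipped " + f + ": " + e.getMessage());
            }
        }

        @Override
        public Document getDocument() {
            return doc;
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("sketch", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "hello index".getBytes(StandardCharsets.UTF_8));

        PrintWriter log = new PrintWriter(System.err, true);
        DocumentAnalyze reader = new PlainTextSketchReader();
        reader.setFile(tmp, log);
        System.out.println(reader.getDocument().contents);
    }
}
```

A real implementation would build a Lucene Document with fields in place of the stand-in. The extraction is done in setFile here, but it could equally be deferred to getDocument.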