Review Request 31758: TIKA-1330: tika batch code

Tim Allison

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31758/
-----------------------------------------------------------

Review request for tika.


Repository: tika


Description
-------

TIKA-1330: Tika batch code integrated into tika-app.  This offers robust batch processing (filesystem input -> filesystem output) on a single machine.  The goals are:

1) make the code robust against permanent hangs and OOM (out-of-memory) errors
2) enable easy(ish) extensibility
3) include robust logging
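
To make goal 1 concrete, here is a minimal sketch of one common way to keep a single bad file from stalling a run: execute each task under a hard time limit and report the outcome. It is not taken from the patch. The class TimeoutGuard, the Outcome enum, and the timeout values are hypothetical names chosen for illustration, and the patch itself may handle hangs quite differently.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not part of the patch): runs one task under a hard time limit. */
public class TimeoutGuard {

    /** Illustrative outcome labels; these names are assumptions, not tika-batch types. */
    public enum Outcome { OK, TIMED_OUT, FAILED }

    public static Outcome runWithTimeout(Callable<?> task, long timeoutMillis) {
        // Daemon worker thread: a stuck task must not keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-task");
            t.setDaemon(true);
            return t;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.OK;
        } catch (TimeoutException e) {
            // Interrupts the worker; code that ignores interrupts keeps running.
            future.cancel(true);
            return Outcome.TIMED_OUT;
        } catch (ExecutionException e) {
            // The task threw; the cause is available via e.getCause().
            return Outcome.FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return Outcome.FAILED;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A well-behaved task finishes within the limit.
        System.out.println(runWithTimeout(() -> "parsed", 1000));
        // A task that sleeps far past the limit is reported as timed out.
        System.out.println(runWithTimeout(() -> { Thread.sleep(5000); return null; }, 100));
        // A task that throws is reported as failed.
        System.out.println(runWithTimeout(() -> { throw new IllegalStateException("bad file"); }, 1000));
    }
}
```

An in-JVM timeout like this only helps when the stuck code responds to interruption. A thread that ignores interrupts, or a JVM that has run out of memory, generally needs supervision from outside the process, which this sketch does not attempt.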


Diffs
-----

  trunk/pom.xml 1664211
  trunk/tika-app/pom.xml 1664211
  trunk/tika-app/src/main/java/org/apache/tika/cli/BatchCommandLineBuilder.java PRE-CREATION
  trunk/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java 1664211
  trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLIBatchCommandLineTest.java PRE-CREATION
  trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java 1664211
  trunk/tika-batch/pom.xml PRE-CREATION
  trunk/tika-batch/src/main/examples/batchExecutor.sh PRE-CREATION
  trunk/tika-batch/src/main/examples/log4j.xml PRE-CREATION
  trunk/tika-batch/src/main/examples/log4j_driver.xml PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/AutoDetectParserFactory.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchNoRestartError.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcess.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcessDriverCLI.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/CommandLineInterrupter.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/ConsumersManager.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileConsumerFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResource.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceConsumer.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceCrawler.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceCrawlerFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileStarted.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/IFileProcessorFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/IInterrupter.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/IStatusReporter.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/InterrupterFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/OutputStreamFactory.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/ParallelFileProcessingResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/ParserFactory.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/PoisonFileResource.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/SimpleLogStatusReporter.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/StatusReporterFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/AbstractConsumersBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/BatchProcessBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/CommandLineInterrupterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/CommandLineParserBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ContentHandlerFactoryBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/DefaultContentHandlerFactoryBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ICrawlerBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/IInterupterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ObjectFromDOMAndQueueBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ObjectFromDOMBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ReporterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/SimpleLogReporterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/StatusReporterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/BasicTikaFSConsumer.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSBatchProcessCLI.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSConsumersManager.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSDirectoryCrawler.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSDocumentSelector.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSFileResource.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSListCrawler.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSOutputStreamFactory.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSProperties.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSUtil.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/RecursiveParserWrapperFSConsumer.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders/BasicTikaFSConsumersBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders/FSCrawlerBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/strawman/StrawManTikaAppDriver.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/BatchLocalization.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/ClassLoaderUtil.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/DurationFormatUtils.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/PropsUtil.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/XMLDOMUtil.java PRE-CREATION
  trunk/tika-batch/src/main/resources/org/apache/tika/batch/fs/default-tika-batch-config.xml PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/CommandLineParserBuilderTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/BatchDriverTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/BatchProcessTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/FSBatchTestBase.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/HandlerBuilderTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/OutputStreamFactoryTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/StringStreamGobbler.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/strawman/StrawmanTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/parser/evil/EvilParserFactory.java PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/assertion_error.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/hang_heavy_load.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/no_problem.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/oom_exception.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/runtime_exception.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/sleep.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/sleep_2000.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/tika-evil-config.xml PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/tika-evil-mimetypes.xml PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/tika_exception.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/log4j.properties PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/basic/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load1.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load2.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load3.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load4.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load5.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/no_restart/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/no_restart/test2.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/no_restart/test3.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/hang_heavy_load1.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test2.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test3.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test4.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/asleep_10000.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/hang_heavy_load.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test1b_oom_exception.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test2.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test3.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test4.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/timeout_after_early_termination/asleep_60000.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/wait_after_early_termination/asleep_10000.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/tika-batch-config-basic-test.xml PRE-CREATION
  trunk/tika-batch/src/test/resources/tika-batch-config-evil-test.xml PRE-CREATION
  trunk/tika-core/src/main/java/org/apache/tika/io/IOUtils.java 1664211
  trunk/tika-parsers/src/test/java/org/apache/tika/TikaTest.java 1664211
  trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil/EvilParser.java 1664211

Diff: https://reviews.apache.org/r/31758/diff/


Testing
-------

The code has been in development as part of another fielded project for the last two years.  There are numerous unit tests, though it could always use more.


Thanks,

Tim Allison

Re: Review Request 31758: TIKA-1330: tika batch code

Chris Mattmann

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31758/#review75289
-----------------------------------------------------------



trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcessDriverCLI.java
<https://reviews.apache.org/r/31758/#comment122268>

    awesome! :) I wrote that one.



trunk/tika-core/src/main/java/org/apache/tika/io/IOUtils.java
<https://reviews.apache.org/r/31758/#comment122269>

    Why not use commons-io? http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html


- Chris Mattmann


On March 5, 2015, 3:07 a.m., Tim Allison wrote:

> (Updated March 5, 2015, 3:07 a.m.)

Re: Review Request 31758: TIKA-1330: tika batch code

Tim Allison


> On March 5, 2015, 3:56 a.m., Chris Mattmann wrote:
> > trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcessDriverCLI.java, line 260
> > <https://reviews.apache.org/r/31758/diff/1/?file=885352#file885352line260>
> >
> >     awesome! :) I wrote that one.

Fixed oap->oa  :)


- Tim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31758/#review75289
-----------------------------------------------------------


On March 5, 2015, 3:07 a.m., Tim Allison wrote:

> (Updated March 5, 2015, 3:07 a.m.)
>   trunk/tika-batch/src/test/resources/test-input/oom/asleep_10000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/hang_heavy_load.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test1b_oom_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test2.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test3.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test4.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/timeout_after_early_termination/asleep_60000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/wait_after_early_termination/asleep_10000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/tika-batch-config-basic-test.xml PRE-CREATION
>   trunk/tika-batch/src/test/resources/tika-batch-config-evil-test.xml PRE-CREATION
>   trunk/tika-core/src/main/java/org/apache/tika/io/IOUtils.java 1664211
>   trunk/tika-parsers/src/test/java/org/apache/tika/TikaTest.java 1664211
>   trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil/EvilParser.java 1664211
>
> Diff: https://reviews.apache.org/r/31758/diff/
>
>
> Testing
> -------
>
> Code has been in development as part of another fielded project for the last two years.  Numerous unit tests...could always use more
>
>
> Thanks,
>
> Tim Allison
>
>


Re: Review Request 31758: TIKA-1330: tika batch code

Tyler Palsulich
In reply to this post by Tim Allison

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31758/#review75564
-----------------------------------------------------------



trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcessDriverCLI.java
<https://reviews.apache.org/r/31758/#comment122751>

    Should this be/is this a configurable value?



trunk/tika-batch/src/main/java/org/apache/tika/batch/CommandLineInterrupter.java
<https://reviews.apache.org/r/31758/#comment122754>

    +1 to log the error.


- Tyler Palsulich




Re: Review Request 31758: TIKA-1330: tika batch code

Chris Mattmann
In reply to this post by Tim Allison

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31758/#review75632
-----------------------------------------------------------

Ship It!

- Chris Mattmann




Re: Review Request 31758: TIKA-1330: tika batch code

Tim Allison


> On March 7, 2015, 2:22 p.m., Chris Mattmann wrote:
> > Ship It!

Will do, probably by the end of this week.  There's still a bunch of small cleanup I want to do first.  The one fairly big change is to configure a sane default logging strategy in tika-app that writes logs to a directory called "tika-batch-logs" instead of dumping everything to stdout.
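
For illustration only, here is a rough sketch of the general idea of sending log records to a directory rather than to the console.  It is not the actual tika-app change: it uses java.util.logging from the JDK instead of the log4j configuration files listed in the diff, and the class name BatchLogSetup, the logger name and the file name "batch.log" are all made up for the example.  Only the directory name "tika-batch-logs" comes from the message above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.logging.FileHandler;
import java.util.logging.Handler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical helper, not part of Tika.
final class BatchLogSetup {

    // Returns a logger whose records go to <logDir>/batch.log and not to the console.
    static Logger fileLogger(Path logDir) throws IOException {
        // Create the log directory (and any missing parents) if it does not exist yet.
        Files.createDirectories(logDir);

        Logger logger = Logger.getLogger("tika.batch.sketch"); // assumed logger name
        // Do not pass records up to the root logger, which by default prints to the console.
        logger.setUseParentHandlers(false);

        // Append to the file so that a restarted process keeps the earlier records.
        FileHandler handler = new FileHandler(logDir.resolve("batch.log").toString(), true);
        handler.setFormatter(new SimpleFormatter());
        logger.addHandler(handler);
        return logger;
    }

    public static void main(String[] args) throws IOException {
        Logger log = fileLogger(Paths.get("tika-batch-logs"));
        log.info("batch started");
        // Close the handlers so the file is flushed and its lock file is released.
        for (Handler h : log.getHandlers()) {
            h.close();
        }
    }
}
```

Turning off the parent handlers is what keeps the console quiet in this sketch; a log4j setup would get the same effect by attaching only a file appender.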


- Tim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31758/#review75632
-----------------------------------------------------------


On March 5, 2015, 3:07 a.m., Tim Allison wrote:

>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/31758/
> -----------------------------------------------------------
>
> (Updated March 5, 2015, 3:07 a.m.)
>
>
> Review request for tika.
>
>
> Repository: tika
>
>
> Description
> -------
>
> TIKA-1330: tika batch code integrated into Tika-app.  This offers robust batch processing code filesystem input -> filesystem output on a single machine.  The goals are:
>
> 1) to make the code robust against permanent hangs and oom
> 2) enable easy(ish) extensibility
> 3) include robust logging
>
>
> Diffs
> -----
>
>   trunk/pom.xml 1664211
>   trunk/tika-app/pom.xml 1664211
>   trunk/tika-app/src/main/java/org/apache/tika/cli/BatchCommandLineBuilder.java PRE-CREATION
>   trunk/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java 1664211
>   trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLIBatchCommandLineTest.java PRE-CREATION
>   trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java 1664211
>   trunk/tika-batch/pom.xml PRE-CREATION
>   trunk/tika-batch/src/main/examples/batchExecutor.sh PRE-CREATION
>   trunk/tika-batch/src/main/examples/log4j.xml PRE-CREATION
>   trunk/tika-batch/src/main/examples/log4j_driver.xml PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/AutoDetectParserFactory.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchNoRestartError.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcess.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcessDriverCLI.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/CommandLineInterrupter.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/ConsumersManager.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileConsumerFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResource.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceConsumer.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceCrawler.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceCrawlerFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/FileStarted.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/IFileProcessorFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/IInterrupter.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/IStatusReporter.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/InterrupterFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/OutputStreamFactory.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/ParallelFileProcessingResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/ParserFactory.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/PoisonFileResource.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/SimpleLogStatusReporter.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/StatusReporterFutureResult.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/AbstractConsumersBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/BatchProcessBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/CommandLineInterrupterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/CommandLineParserBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ContentHandlerFactoryBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/DefaultContentHandlerFactoryBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ICrawlerBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/IInterupterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ObjectFromDOMAndQueueBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ObjectFromDOMBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ReporterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/SimpleLogReporterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/StatusReporterBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/BasicTikaFSConsumer.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSBatchProcessCLI.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSConsumersManager.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSDirectoryCrawler.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSDocumentSelector.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSFileResource.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSListCrawler.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSOutputStreamFactory.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSProperties.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSUtil.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/RecursiveParserWrapperFSConsumer.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders/BasicTikaFSConsumersBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders/FSCrawlerBuilder.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/strawman/StrawManTikaAppDriver.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/BatchLocalization.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/ClassLoaderUtil.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/DurationFormatUtils.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/PropsUtil.java PRE-CREATION
>   trunk/tika-batch/src/main/java/org/apache/tika/util/XMLDOMUtil.java PRE-CREATION
>   trunk/tika-batch/src/main/resources/org/apache/tika/batch/fs/default-tika-batch-config.xml PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/CommandLineParserBuilderTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/BatchDriverTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/BatchProcessTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/FSBatchTestBase.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/HandlerBuilderTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/OutputStreamFactoryTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/StringStreamGobbler.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/strawman/StrawmanTest.java PRE-CREATION
>   trunk/tika-batch/src/test/java/org/apache/tika/parser/evil/EvilParserFactory.java PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/assertion_error.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/hang_heavy_load.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/no_problem.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/oom_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/runtime_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/sleep.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/sleep_2000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/tika-evil-config.xml PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/tika-evil-mimetypes.xml PRE-CREATION
>   trunk/tika-batch/src/test/resources/evil/tika_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/log4j.properties PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/basic/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load1.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load2.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load3.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load4.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load5.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/no_restart/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/no_restart/test2.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/no_restart/test3.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/hang_heavy_load1.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test2.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test3.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test4.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/asleep_10000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/hang_heavy_load.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test1.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test1b_oom_exception.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test2.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test3.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/oom/test4.txt PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/timeout_after_early_termination/asleep_60000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/test-input/wait_after_early_termination/asleep_10000.evil PRE-CREATION
>   trunk/tika-batch/src/test/resources/tika-batch-config-basic-test.xml PRE-CREATION
>   trunk/tika-batch/src/test/resources/tika-batch-config-evil-test.xml PRE-CREATION
>   trunk/tika-core/src/main/java/org/apache/tika/io/IOUtils.java 1664211
>   trunk/tika-parsers/src/test/java/org/apache/tika/TikaTest.java 1664211
>   trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil/EvilParser.java 1664211
>
> Diff: https://reviews.apache.org/r/31758/diff/
>
>
> Testing
> -------
>
> The code has been in development for the last two years as part of another fielded project.  It has numerous unit tests, though it could always use more.
>
>
> Thanks,
>
> Tim Allison
>
>

Re: Review Request 31758: TIKA-1330: tika batch code

Tim Allison
In reply to this post by Tim Allison

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31758/
-----------------------------------------------------------

(Updated March 9, 2015, 11:07 a.m.)


Review request for tika.


Repository: tika


Description
-------

TIKA-1330: tika batch code integrated into tika-app.  This offers robust batch processing (filesystem input -> filesystem output) on a single machine.  The goals are:

1) make the code robust against permanent hangs and out-of-memory errors (OOM)
2) enable easy(ish) extensibility
3) include robust logging
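
To illustrate goal 1, here is a minimal sketch of guarding a single task against hangs and OOM with a timeout.  It is not taken from the tika-batch code in this review; the class name TimeoutGuard, the Outcome enum and the runGuarded method are hypothetical names chosen for the example, and it uses only java.util.concurrent.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; not part of the tika-batch module.
public class TimeoutGuard {

    /** Possible outcomes of one guarded task. */
    public enum Outcome { OK, TIMED_OUT, FAILED, OUT_OF_MEMORY }

    /** Runs the task on a daemon thread and gives up after timeoutMillis. */
    public static Outcome runGuarded(Callable<?> task, long timeoutMillis)
            throws InterruptedException {
        // A daemon thread cannot keep the JVM alive if the task never returns.
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-task");
            t.setDaemon(true);
            return t;
        });
        Future<?> future = pool.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.OK;
        } catch (TimeoutException e) {
            // Interrupt the worker; a task that ignores interrupts keeps running.
            future.cancel(true);
            return Outcome.TIMED_OUT;
        } catch (ExecutionException e) {
            // Errors thrown by the task (including OutOfMemoryError) arrive as the cause.
            return (e.getCause() instanceof OutOfMemoryError)
                    ? Outcome.OUT_OF_MEMORY
                    : Outcome.FAILED;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runGuarded(() -> "parsed", 1000));
        System.out.println(runGuarded(() -> { Thread.sleep(60_000); return null; }, 100));
        System.out.println(runGuarded(() -> { throw new RuntimeException("bad file"); }, 1000));
    }
}
```

A thread that ignores interruption cannot be stopped from inside the same JVM, and a JVM that has hit OOM may be in an unreliable state, so a timeout like this only detects the problem.  Recovering from it generally means restarting the whole process from a separate parent process.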


Diffs
-----

  trunk/pom.xml 1664211
  trunk/tika-app/pom.xml 1664211
  trunk/tika-app/src/main/java/org/apache/tika/cli/BatchCommandLineBuilder.java PRE-CREATION
  trunk/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java 1664211
  trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLIBatchCommandLineTest.java PRE-CREATION
  trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java 1664211
  trunk/tika-batch/pom.xml PRE-CREATION
  trunk/tika-batch/src/main/examples/batchExecutor.sh PRE-CREATION
  trunk/tika-batch/src/main/examples/log4j.xml PRE-CREATION
  trunk/tika-batch/src/main/examples/log4j_driver.xml PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/AutoDetectParserFactory.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchNoRestartError.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcess.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/BatchProcessDriverCLI.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/CommandLineInterrupter.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/ConsumersManager.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileConsumerFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResource.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceConsumer.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceCrawler.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileResourceCrawlerFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/FileStarted.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/IFileProcessorFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/IInterrupter.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/IStatusReporter.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/InterrupterFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/OutputStreamFactory.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/ParallelFileProcessingResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/ParserFactory.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/PoisonFileResource.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/SimpleLogStatusReporter.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/StatusReporterFutureResult.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/AbstractConsumersBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/BatchProcessBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/CommandLineInterrupterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/CommandLineParserBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ContentHandlerFactoryBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/DefaultContentHandlerFactoryBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ICrawlerBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/IInterupterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ObjectFromDOMAndQueueBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ObjectFromDOMBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/ReporterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/SimpleLogReporterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/builders/StatusReporterBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/BasicTikaFSConsumer.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSBatchProcessCLI.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSConsumersManager.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSDirectoryCrawler.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSDocumentSelector.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSFileResource.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSListCrawler.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSOutputStreamFactory.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSProperties.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/FSUtil.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/RecursiveParserWrapperFSConsumer.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders/BasicTikaFSConsumersBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders/FSCrawlerBuilder.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/strawman/StrawManTikaAppDriver.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/BatchLocalization.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/ClassLoaderUtil.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/DurationFormatUtils.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/PropsUtil.java PRE-CREATION
  trunk/tika-batch/src/main/java/org/apache/tika/util/XMLDOMUtil.java PRE-CREATION
  trunk/tika-batch/src/main/resources/org/apache/tika/batch/fs/default-tika-batch-config.xml PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/CommandLineParserBuilderTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/BatchDriverTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/BatchProcessTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/FSBatchTestBase.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/HandlerBuilderTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/OutputStreamFactoryTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/StringStreamGobbler.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/batch/fs/strawman/StrawmanTest.java PRE-CREATION
  trunk/tika-batch/src/test/java/org/apache/tika/parser/evil/EvilParserFactory.java PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/assertion_error.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/hang_heavy_load.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/no_problem.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/oom_exception.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/runtime_exception.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/sleep.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/sleep_2000.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/tika-evil-config.xml PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/tika-evil-mimetypes.xml PRE-CREATION
  trunk/tika-batch/src/test/resources/evil/tika_exception.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/log4j.properties PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/basic/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load1.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load2.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load3.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load4.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/hang_heavy_load5.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/heavy_heavy_hangs/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/no_restart/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/no_restart/test2.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/no_restart/test3.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/hang_heavy_load1.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test2.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test3.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/one_heavy_hang/test4.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/asleep_10000.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/hang_heavy_load.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test1.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test1b_oom_exception.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test2.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test3.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/oom/test4.txt PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/timeout_after_early_termination/asleep_60000.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/test-input/wait_after_early_termination/asleep_10000.evil PRE-CREATION
  trunk/tika-batch/src/test/resources/tika-batch-config-basic-test.xml PRE-CREATION
  trunk/tika-batch/src/test/resources/tika-batch-config-evil-test.xml PRE-CREATION
  trunk/tika-core/src/main/java/org/apache/tika/io/IOUtils.java 1664211
  trunk/tika-parsers/src/test/java/org/apache/tika/TikaTest.java 1664211
  trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil/EvilParser.java 1664211

Diff: https://reviews.apache.org/r/31758/diff/


Testing
-------

The code has been in development for the last two years as part of another fielded project.  It has numerous unit tests, though it could always use more.


Thanks,

Tim Allison