[jira] [Commented] (TIKA-1332) Create tika-eval module

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/TIKA-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870661#comment-15870661 ]

Hudson commented on TIKA-1332:

SUCCESS: Integrated in Jenkins build tika-2.x #217 (See [https://builds.apache.org/job/tika-2.x/217/])
TIKA-1332 3rd time's the charm.  Fix dependencies with IOUtils. (tallison: rev 61532258f2ff44787050f0f3a0bb8ba17d8e50b0)
* (edit) tika-eval/src/main/java/org/apache/tika/eval/io/DBWriter.java
* (edit) tika-eval/src/main/java/org/apache/tika/eval/io/ExtractReader.java
* (edit) tika-eval/src/main/java/org/apache/tika/eval/tokens/TokenIntPair.java
* (edit) tika-eval/src/main/java/org/apache/tika/eval/io/XMLLogReader.java
* (edit) tika-eval/src/main/java/org/apache/tika/eval/XMLErrorLogUpdater.java
* (edit) tika-eval/src/main/java/org/apache/tika/eval/db/DBUtil.java

> Create tika-eval module
> -----------------------
>                 Key: TIKA-1332
>                 URL: https://issues.apache.org/jira/browse/TIKA-1332
>             Project: Tika
>          Issue Type: Sub-task
>          Components: cli, general, server
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>             Fix For: 2.0, 1.15
>         Attachments: comparison_reports.xml
> For this issue, we can start with code to gather statistics on each run (number of exceptions per file type, most common exceptions per file type, number of metadata items, total text extracted, etc.). We should also be able to compare one run against another. Going forward, there's plenty of room to improve.
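
As a rough illustration of the per-run statistics and run-to-run comparison described in the issue, a minimal Java sketch follows. It is not taken from the tika-eval sources: the class `RunStatsSketch`, the `Result` holder, and the methods `exceptionsPerType` and `compare` are hypothetical names chosen for this example, and it covers only the "exceptions per file type" statistic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from tika-eval itself.
public class RunStatsSketch {

    /** One extraction result: the file type and the exception raised, if any. */
    public static final class Result {
        final String fileType;
        final String exception; // null when extraction succeeded

        public Result(String fileType, String exception) {
            this.fileType = fileType;
            this.exception = exception;
        }
    }

    /** Counts how many files of each type threw an exception in one run. */
    public static Map<String, Integer> exceptionsPerType(List<Result> run) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Result r : run) {
            if (r.exception != null) {
                counts.merge(r.fileType, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Per-type change in exception count between two runs (B minus A). */
    public static Map<String, Integer> compare(Map<String, Integer> a,
                                               Map<String, Integer> b) {
        Map<String, Integer> delta = new TreeMap<>();
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            delta.put(e.getKey(), -e.getValue());
        }
        for (Map.Entry<String, Integer> e : b.entrySet()) {
            delta.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        return delta;
    }

    public static void main(String[] args) {
        // Made-up sample data for two runs over the same three files.
        List<Result> runA = Arrays.asList(
                new Result("pdf", "IOException"),
                new Result("pdf", null),
                new Result("doc", "EncryptedDocumentException"));
        List<Result> runB = Arrays.asList(
                new Result("pdf", "IOException"),
                new Result("pdf", "TikaException"),
                new Result("doc", null));

        // Positive values mean run B had more exceptions for that type.
        System.out.println(compare(exceptionsPerType(runA), exceptionsPerType(runB)));
        // prints {doc=-1, pdf=1}
    }
}
```

The other statistics listed in the description (most common exceptions, metadata item counts, total extracted text) could be aggregated in the same way, keyed by file type.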

This message was sent by Atlassian JIRA