[jira] [Commented] (TIKA-1332) Create "eval" code

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/TIKA-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15867992#comment-15867992 ]

Tim Allison commented on TIKA-1332:
-----------------------------------

Are there any licensing objections to adding a dependency on the H2 database in the tika-eval module? H2 is dual-licensed under MPL 2.0 and EPL 1.0. Both are "weak copyleft" licenses and should be OK if we document them according to https://www.apache.org/legal/resolved#category-b.

As a side note, this dependency will only exist for the tika-eval module, not for any of the other modules.

> Create "eval" code
> ------------------
>
>                 Key: TIKA-1332
>                 URL: https://issues.apache.org/jira/browse/TIKA-1332
>             Project: Tika
>          Issue Type: Sub-task
>          Components: cli, general, server
>            Reporter: Tim Allison
>         Attachments: comparison_reports.xml
>
>
> For this issue, we can start with code to gather statistics on each run (# of exceptions per file type, most common exceptions per file type, number of metadata items, total text extracted, etc).  We should also be able to compare one run against another.  Going forward, there's plenty of room to improve.
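
As a rough illustration of the statistics described above, here is a minimal sketch in Python. It is not the tika-eval implementation: the FileResult record, its field names, and the summarize and compare helpers are all hypothetical, and it assumes each run has already been reduced to one record per file.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileResult:
    """Hypothetical per-file record for one run; not a tika-eval class."""
    path: str
    file_type: str
    exception: Optional[str]  # exception class name, or None on success
    metadata_items: int
    text_length: int


def summarize(results):
    """Aggregate one run into the statistics listed in the issue."""
    exceptions_by_type = defaultdict(Counter)
    total_metadata = 0
    total_text = 0
    for r in results:
        if r.exception is not None:
            exceptions_by_type[r.file_type][r.exception] += 1
        total_metadata += r.metadata_items
        total_text += r.text_length
    return {
        # number of exceptions per file type
        "exceptions_per_type": {
            t: sum(c.values()) for t, c in exceptions_by_type.items()
        },
        # most common exception per file type
        "most_common_exception": {
            t: c.most_common(1)[0][0] for t, c in exceptions_by_type.items()
        },
        "metadata_items": total_metadata,
        "text_length": total_text,
    }


def compare(summary_a, summary_b):
    """Change in exception count per file type from run A to run B."""
    a = summary_a["exceptions_per_type"]
    b = summary_b["exceptions_per_type"]
    return {t: b.get(t, 0) - a.get(t, 0) for t in sorted(set(a) | set(b))}
```

A negative value from compare means the second run threw fewer exceptions for that file type, and a positive value flags a possible regression.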



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)