[jira] Created: (TIKA-99) Support external parser programs

JIRA jira@apache.org
Support external parser programs
--------------------------------

                 Key: TIKA-99
                 URL: https://issues.apache.org/jira/browse/TIKA-99
             Project: Tika
          Issue Type: New Feature
            Reporter: Jukka Zitting
            Priority: Minor


There should be a parser component (like ExternalParser) that invokes an external command-line application, feeds the given document as input to the application, and returns the application's output as the extracted text (or XHTML) content. This would allow integration with tools like catdoc or pdf2txt.
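
The flow described above (start a command, feed it the document, collect its output as text) can be sketched roughly as follows. This is only an illustration and not the ExternalParser class itself: the class name `ExternalCommandSketch`, the method `extractText`, the UTF-8 output encoding, and the assumption that the tool reads standard input and writes standard output are all hypothetical choices made for this example. It needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical illustration only; this is not Tika's ExternalParser.
public class ExternalCommandSketch {

    // Runs the given command with the document on its stdin and returns
    // whatever the command prints on stdout, decoded as UTF-8 (an assumption).
    public static String extractText(List<String> command, InputStream document) {
        Path input = null;
        try {
            // Spool the document to a temporary file so the child process can
            // read it as stdin while we drain stdout; writing and reading both
            // pipes from a single thread could otherwise deadlock.
            input = Files.createTempFile("external-parser-", ".bin");
            Files.copy(document, input, StandardCopyOption.REPLACE_EXISTING);

            Process process = new ProcessBuilder(command)
                    .redirectInput(input.toFile())
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .start();

            byte[] output;
            try (InputStream stdout = process.getInputStream()) {
                output = stdout.readAllBytes();
            }

            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("Command exited with status " + exitCode);
            }
            return new String(output, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag before reporting the failure.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for command", e);
        } finally {
            if (input != null) {
                try {
                    Files.deleteIfExists(input);
                } catch (IOException ignored) {
                    // Best-effort cleanup of the temporary file.
                }
            }
        }
    }
}
```

Calling `ExternalCommandSketch.extractText(List.of("cat"), stream)` on a Unix-like system simply echoes the document back, which is a convenient way to try the plumbing before substituting a real converter. A real parser component would additionally need to stream the output into a content handler instead of buffering it in memory.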

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (TIKA-99) Support external parser programs

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/TIKA-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-99:
----------------------------------

    Component/s: parser

> Support external parser programs
> --------------------------------
>
>                 Key: TIKA-99
>                 URL: https://issues.apache.org/jira/browse/TIKA-99
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Priority: Minor
>
> There should be a parser component (like ExternalParser) that invokes an external command-line application, feeds the given document as input to the application, and returns the application's output as the extracted text (or XHTML) content. This would allow integration with tools like catdoc or pdf2txt.

[jira] Resolved: (TIKA-99) Support external parser programs

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/TIKA-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-99.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 0.2-incubating
         Assignee: Jukka Zitting

ExternalParser class implemented in revision 676141.

> Support external parser programs
> --------------------------------
>
>                 Key: TIKA-99
>                 URL: https://issues.apache.org/jira/browse/TIKA-99
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 0.2-incubating
>
>
> There should be a parser component (like ExternalParser) that invokes an external command-line application, feeds the given document as input to the application, and returns the application's output as the extracted text (or XHTML) content. This would allow integration with tools like catdoc or pdf2txt.
