[jira] Resolved: (TIKA-125) Pass Locale information to parsers

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/TIKA-125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-125.

       Resolution: Fixed
    Fix Version/s: 0.6
         Assignee: Jukka Zitting

The parse context mechanism is perfect for this need. Use the following code to specify the default locale to be used when formatting data from documents like Excel sheets that don't contain explicit locale information:

    Parser parser = ...;
    ParseContext context = new ParseContext();
    context.set(Locale.class, myLocale);
    parser.parse(..., context);
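
To illustrate the idea without depending on Tika itself, here is a minimal, self-contained sketch of how a parser might look up a Locale in a type-keyed context and fall back to the JVM default when none is set. The names `LocaleContextSketch`, `SimpleContext` and `formatCell` are hypothetical stand-ins invented for this example; they are not Tika's actual classes, and the real parsers may consume the context differently.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class LocaleContextSketch {

    // Hypothetical stand-in for a parse context: a map keyed by class,
    // so each type (such as Locale) has at most one entry.
    static final class SimpleContext {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        // Returns the stored value for the key, or defaultValue if none was set.
        <T> T get(Class<T> key, T defaultValue) {
            Object value = entries.get(key);
            return value != null ? key.cast(value) : defaultValue;
        }
    }

    // Hypothetical parser helper: renders a numeric cell using the Locale
    // from the context, falling back to the JVM-wide default.
    static String formatCell(double value, SimpleContext context) {
        Locale locale = context.get(Locale.class, Locale.getDefault());
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        SimpleContext german = new SimpleContext();
        german.set(Locale.class, Locale.GERMANY);
        System.out.println(formatCell(1234.5, german)); // prints 1.234,5

        SimpleContext american = new SimpleContext();
        american.set(Locale.class, Locale.US);
        System.out.println(formatCell(1234.5, american)); // prints 1,234.5
    }
}
```

Because the locale travels with each parse call, two clients in the same JVM can request different renderings of the same value without touching the global default.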

> Pass Locale information to parsers
> ----------------------------------
>                 Key: TIKA-125
>                 URL: https://issues.apache.org/jira/browse/TIKA-125
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>             Fix For: 0.6
> Looking at TIKA-103 I realized that some file formats can contain data whose text rendering depends on the active Locale which might not be explicitly specified in the file format or the specific document being parsed.
> It should be possible for a parser client to explicitly specify which Locale should be used as the default when extracting text from a document. Setting the global default with Locale.setDefault() is not an option in many cases.
> I think the best way to pass Locale information to a parser is as a part of the Metadata object.
