[jira] [Commented] (TIKA-3009) XML Parser reset() detection no working in weblogic

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/TIKA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994601#comment-16994601 ]

Daniel commented on TIKA-3009:

Thanks for the quick reply. I don't think that calling reset() twice would solve the problem. The underlying parser is only created once the wrapping parser has actually been used, and only after that does a call to reset() perform a real reset.

The quickest fix would be to catch a possible UnsupportedOperationException, even though it is not really expected once the test has indicated support for this feature. This could be done either in the PoolParser's reset() method or in releaseParser() (as was done up to Tika 1.19).
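
To illustrate the idea, here is a minimal sketch of a release path that tolerates a late UnsupportedOperationException. It is not Tika's code: PooledParser, POOL and release() are hypothetical stand-ins, and replacing the failed parser with a fresh instance is my own assumption about what the catch block might do.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class DefensiveReleaseDemo {

    /** Hypothetical pooled parser; reset() starts failing only after first use. */
    static final class PooledParser {
        private boolean used;

        void parse(String xml) {
            used = true; // actual parsing omitted
        }

        void reset() {
            if (used) {
                throw new UnsupportedOperationException("reset");
            }
        }
    }

    /** Hypothetical fixed-size pool; not Tika's actual data structure. */
    static final BlockingQueue<PooledParser> POOL = new ArrayBlockingQueue<>(2);

    /**
     * Returns a parser to the pool. If reset() turns out to be unsupported,
     * a fresh parser is offered instead (an assumption of this sketch),
     * so the pool does not shrink.
     *
     * @return true if the same instance was reused, false if it was replaced
     */
    static boolean release(PooledParser parser) {
        boolean reused = true;
        try {
            parser.reset();
        } catch (UnsupportedOperationException e) {
            parser = new PooledParser();
            reused = false;
        }
        POOL.offer(parser);
        return reused;
    }

    public static void main(String[] args) {
        PooledParser parser = new PooledParser();
        parser.parse("<doc/>");
        System.out.println("reused: " + release(parser)); // prints "reused: false"
        System.out.println("pool size: " + POOL.size());  // prints "pool size: 1"
    }
}
```

The point is only that the pool gets a usable parser back whether or not reset() works, regardless of what the creation-time test reported.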

> XML Parser reset() detection no working in weblogic
> ------------------------------------------------------------
>                 Key: TIKA-3009
>                 URL: https://issues.apache.org/jira/browse/TIKA-3009
>             Project: Tika
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.20, 1.21, 1.22, 1.23
>         Environment: JDK 1.8.0_231
> Oracle Weblogic Server
>            Reporter: Daniel
>            Priority: Critical
> Starting with Tika 1.20, org.apache.tika.utils.XMLReaderUtils tries to detect whether an XML parser supports reset() by calling reset() during poolParser creation and watching for an UnsupportedOperationException.
> Unfortunately, this does not work in WebLogic Server, because the RegistryParser obtained there itself caches the underlying SAX parsers. The reset() of the underlying SAXParser is called only after first use, and only then produces the UnsupportedOperationException. A first call to reset() does not produce this exception, so XMLReaderUtils concludes that the parser supports reset(), which is not true.
> This results in exhaustion of the parser pool, and in intermittent errors and delays in processing, as the pool is reset when a parser is not available after 5 minutes.
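
The failure mode described in the report can be reproduced in isolation with a small sketch. The classes below are hypothetical stand-ins (they are neither WebLogic's RegistryParser nor Tika's XMLReaderUtils) and only assume what the report states: the wrapper creates its delegate on first use, and reset() reaches the delegate only once it exists.

```java
final class ResetProbeDemo {

    /** Hypothetical underlying parser that does not support reset(). */
    static final class UnderlyingParser {
        void parse(String xml) {
            // actual parsing omitted
        }

        void reset() {
            throw new UnsupportedOperationException("reset");
        }
    }

    /** Hypothetical wrapper that creates its delegate lazily. */
    static final class LazyWrapperParser {
        private UnderlyingParser delegate; // null until first use

        void parse(String xml) {
            if (delegate == null) {
                delegate = new UnderlyingParser();
            }
            delegate.parse(xml);
        }

        void reset() {
            if (delegate != null) {
                delegate.reset(); // nothing to reset before first use
            }
        }
    }

    /** Probe in the style of the report: call reset() and watch for the exception. */
    static boolean probeSupportsReset(LazyWrapperParser parser) {
        try {
            parser.reset();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        LazyWrapperParser parser = new LazyWrapperParser();
        // At creation time the probe passes, because no delegate exists yet.
        System.out.println("at creation: " + probeSupportsReset(parser)); // prints "at creation: true"
        parser.parse("<a/>");
        // After first use the same call reaches the delegate and fails.
        System.out.println("after use: " + probeSupportsReset(parser));   // prints "after use: false"
    }
}
```

A probe that runs before the parser has been used therefore reports support that is not really there.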

This message was sent by Atlassian Jira