[jira] [Commented] (NUTCH-2582) Set pool size of XML SAX parsers used for MIME detection in Tika 1.19


Steve Loughran (Jira)

    [ https://issues.apache.org/jira/browse/NUTCH-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17234499#comment-17234499 ]

ASF GitHub Bot commented on NUTCH-2582:
---------------------------------------

sebastian-nagel merged pull request #554:
URL: https://github.com/apache/nutch/pull/554

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[hidden email]


> Set pool size of XML SAX parsers used for MIME detection in Tika 1.19
> ---------------------------------------------------------------------
>
>                 Key: NUTCH-2582
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2582
>             Project: Nutch
>          Issue Type: Improvement
>          Components: protocol
>    Affects Versions: 1.15
>            Reporter: Sebastian Nagel
>            Assignee: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.18
>
>
> See [NUTCH-2578|https://issues.apache.org/jira/browse/NUTCH-2578?focusedCommentId=16482879&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16482879]. Tika 1.19 will use a pool of SAX parsers to avoid the bottleneck of creating a new parser each time one is needed (see NUTCH-2578/TIKA-2645). The Fetcher should adjust the size of the pool to the number of Fetcher threads (or to a fraction of that number, because most threads are likely to be busy fetching content).
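
The sizing rule described in the issue can be sketched roughly as follows. This is only an illustration, not the code merged in the pull request: the class name, the `poolSizeFor` helper, and the `fraction` parameter are hypothetical, and the Tika call is mentioned only in a comment so that the sketch compiles without Tika on the classpath.

```java
public class SaxPoolSizing {

    /**
     * Hypothetical helper: derive a SAX parser pool size from the number of
     * fetcher threads. A fraction of 1.0 gives one parser per thread; a
     * smaller fraction reflects that most threads are busy fetching content
     * rather than detecting MIME types. The result is never below 1.
     */
    static int poolSizeFor(int fetcherThreads, double fraction) {
        if (fetcherThreads < 1) {
            throw new IllegalArgumentException("fetcherThreads must be >= 1");
        }
        if (!(fraction > 0.0 && fraction <= 1.0)) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        // Round up so that a small thread count still gets at least one parser.
        int size = (int) Math.ceil(fetcherThreads * fraction);
        return Math.max(1, size);
    }

    public static void main(String[] args) {
        int threads = 50;      // illustrative thread count
        double fraction = 0.25; // illustrative fraction, not a recommended value
        int poolSize = poolSizeFor(threads, fraction);
        System.out.println("threads=" + threads + " poolSize=" + poolSize);

        // Assumption: with Tika 1.19 or later on the classpath, the computed
        // value would be applied once at startup, for example through
        // org.apache.tika.utils.XMLReaderUtils.setPoolSize(poolSize).
        // Check the Tika Javadoc for the exact signature before relying on it.
    }
}
```

Rounding up and clamping to a minimum of one keeps the pool usable even for a single-threaded fetcher; which fraction works well in practice would have to be measured.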



--
This message was sent by Atlassian Jira
(v8.3.4#803005)