[jira] [Commented] (LUCENE-4857) StemmerOverrideFilter should not copy the stem override dictionary in its ctor.

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13607640#comment-13607640 ]

Robert Muir commented on LUCENE-4857:
-------------------------------------

+1

I think we should do this for 4.2.1, but change this filter to use an FST for 4.3.

That way, a big dictionary won't eat up tons of RAM, and the FST also enforces immutability.

It means its factory must do a little more work, but I think that's OK.
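
To illustrate the "factory does a little more work" idea, here is a minimal sketch in plain Java. It does not use the Lucene API: OverrideDictionary and OverrideFactory are hypothetical names, and an unmodifiable map stands in for the FST (so it shows only the immutability, not the memory savings). The factory builds the read-only dictionary once, and later changes to the source map do not leak into it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class StemOverrideFactoryDemo {

  /** Hypothetical read-only dictionary; a real implementation might wrap an FST instead. */
  static final class OverrideDictionary {
    private final Map<String, String> entries;

    private OverrideDictionary(Map<String, String> entries) {
      this.entries = entries;
    }

    /** Returns the override stem for the term, or null if there is none. */
    String lookup(String term) {
      return entries.get(term);
    }
  }

  /** Hypothetical factory: pays the build cost once, up front. */
  static final class OverrideFactory {
    private final OverrideDictionary dictionary;

    OverrideFactory(Map<String, String> raw) {
      // One defensive copy, then wrapped so nobody can mutate it afterwards.
      Map<String, String> frozen = Collections.unmodifiableMap(new HashMap<>(raw));
      this.dictionary = new OverrideDictionary(frozen);
    }

    /** Every filter created from this factory would share this one instance. */
    OverrideDictionary dictionary() {
      return dictionary;
    }
  }

  public static void main(String[] args) {
    Map<String, String> raw = new HashMap<>();
    raw.put("mice", "mouse");

    OverrideFactory factory = new OverrideFactory(raw);
    raw.put("geese", "goose"); // mutation after the build is not visible

    System.out.println(factory.dictionary().lookup("mice"));  // mouse
    System.out.println(factory.dictionary().lookup("geese")); // null
  }
}
```

The filter itself then only needs a reference to the shared dictionary, so creating it is cheap regardless of dictionary size.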
               

> StemmerOverrideFilter should not copy the stem override dictionary in its ctor.
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-4857
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4857
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 4.0, 4.1, 4.2
>            Reporter: Simon Willnauer
>            Priority: Minor
>             Fix For: 5.0, 4.2.1
>
>         Attachments: LUCENE-4857.patch
>
>
> Currently the dictionary is cloned each time the token filter is created. This is a serious bottleneck if you use this filter with large dictionaries, and it can also lead to OOMs if lots of those filters sit in ThreadLocals and new threads are added, etc. I think cloning the map should be done in the analyzer (which all of our analyzers do, btw; this is the only TF that clones the map itself), so there is no need to really copy that map in the filter.
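
The difference described above can be sketched in plain Java, without any Lucene classes. CopyingFilter, SharingFilter, and DemoAnalyzer are hypothetical stand-ins, not the real StemmerOverrideFilter or its patch: the first clones the map in its constructor, while the second keeps a reference to a map that the analyzer copied once.

```java
import java.util.HashMap;
import java.util.Map;

class StemOverrideCopyDemo {

  /** Hypothetical stand-in for the reported behaviour: every filter clones the dictionary. */
  static final class CopyingFilter {
    final Map<String, String> dictionary;

    CopyingFilter(Map<String, String> dictionary) {
      // O(n) time and memory on every filter construction.
      this.dictionary = new HashMap<>(dictionary);
    }
  }

  /** Hypothetical stand-in for the proposed behaviour: the filter only keeps a reference. */
  static final class SharingFilter {
    final Map<String, String> dictionary;

    SharingFilter(Map<String, String> dictionary) {
      this.dictionary = dictionary; // no copy
    }
  }

  /** Hypothetical analyzer: takes one defensive copy, then hands it to each filter. */
  static final class DemoAnalyzer {
    private final Map<String, String> dictionary;

    DemoAnalyzer(Map<String, String> dictionary) {
      this.dictionary = new HashMap<>(dictionary); // copied once per analyzer
    }

    SharingFilter createFilter() {
      return new SharingFilter(dictionary);
    }
  }

  public static void main(String[] args) {
    Map<String, String> overrides = new HashMap<>();
    overrides.put("running", "run");
    overrides.put("mice", "mouse");

    DemoAnalyzer analyzer = new DemoAnalyzer(overrides);
    SharingFilter a = analyzer.createFilter();
    SharingFilter b = analyzer.createFilter();
    System.out.println("shared instance: " + (a.dictionary == b.dictionary)); // true

    CopyingFilter c = new CopyingFilter(overrides);
    CopyingFilter d = new CopyingFilter(overrides);
    System.out.println("copied instance: " + (c.dictionary == d.dictionary)); // false
  }
}
```

With the copying variant, a filter held per thread means one full dictionary per thread; with the sharing variant, the number of threads no longer multiplies the memory used by the dictionary.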
