Re: [jira] [Commented] (LUCENE-9286) FST arc.copyOf clones BitTables and this can lead to excessive memory use

Michael Sokolov-4
Oh, good catch! Thanks for digging, Adrien. We had received reports of
our JP indexes taking longer to build (nothing like 6x, but noticeable;
I guess analysis is only part of the indexing time).

On Mon, May 25, 2020 at 3:54 AM Adrien Grand (Jira) <[hidden email]> wrote:

>
>
>     [ https://issues.apache.org/jira/browse/LUCENE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17115802#comment-17115802 ]
>
> Adrien Grand commented on LUCENE-9286:
> --------------------------------------
>
> FYI, I was just digging into a Kuromoji regression introduced in 8.4 that made analysis run about 6x slower. Interestingly, the slowdown was present on both branch_8_4 and branch_8_5 but not on branch_8x, and git bisect pointed to this commit as the fix for the regression.
>
> > FST arc.copyOf clones BitTables and this can lead to excessive memory use
> > -------------------------------------------------------------------------
> >
> >                 Key: LUCENE-9286
> >                 URL: https://issues.apache.org/jira/browse/LUCENE-9286
> >             Project: Lucene - Core
> >          Issue Type: Bug
> >    Affects Versions: 8.5
> >            Reporter: Dawid Weiss
> >            Assignee: Bruno Roustant
> >            Priority: Major
> >             Fix For: 8.6
> >
> >         Attachments: screen-[1].png
> >
> >          Time Spent: 1h 50m
> >  Remaining Estimate: 0h
> >
> > I see a dramatic increase in the amount of memory required for construction of (arguably large) automata. It currently OOMs with 8GB of memory consumed for bit tables. I am pretty sure this didn't require so much memory before (the automaton is ~50MB after construction).
> > Something bad happened in between. Thoughts, [~broustant], [~sokolov]?
>
>
>
> --
> This message was sent by Atlassian Jira
> (v8.3.4#803005)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
