[jira] [Commented] (LUCENE-8759) BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-8759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16819305#comment-16819305 ]

Adrien Grand commented on LUCENE-8759:

Maybe the test could explicitly test both normal and denormal floats all the time? Otherwise +1.

I'm curious whether this makes any difference when running luceneutil.

> BlockMaxConjunctionScorer's simplified way of computing max scores hurts performance
> ------------------------------------------------------------------------------------
>                 Key: LUCENE-8759
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8759
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-8759.patch
> BlockMaxConjunctionScorer computes the minimum value that the score must have after each scorer, so that it can stop scoring a document as early as possible. For instance, say scorers A, B and C produce maximum scores equal to 4, 2 and 1. If the minimum competitive score is X, then the score after scoring A, B and C must be at least X, the score after scoring A and B must be at least X-1, and the score after scoring A must be at least X-1-2.
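The arithmetic in the paragraph above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: the class and method names are hypothetical (they are not Lucene's), the maximum scores are passed in as a plain array, and floating-point rounding is ignored entirely.

```java
import java.util.Arrays;

public class MinRequiredScores {
    /**
     * Hypothetical helper, not Lucene code. maxScores[i] is the maximum score that
     * scorer i can produce. The result holds, for each i, the minimum partial sum
     * needed after scoring scorers 0..i for the total to still reach minCompetitiveScore.
     */
    public static double[] minRequiredScores(double[] maxScores, double minCompetitiveScore) {
        double[] minScores = new double[maxScores.length];
        double remaining = 0.0; // sum of the max scores of the scorers not yet evaluated
        for (int i = maxScores.length - 1; i >= 0; i--) {
            minScores[i] = Math.max(0.0, minCompetitiveScore - remaining);
            remaining += maxScores[i];
        }
        return minScores;
    }

    public static void main(String[] args) {
        // Scorers A, B and C with maximum scores 4, 2 and 1, and X = 5.
        double[] minScores = minRequiredScores(new double[] {4, 2, 1}, 5);
        System.out.println(Arrays.toString(minScores)); // prints [2.0, 4.0, 5.0]
    }
}
```

With X = 5 the thresholds are X-1-2 = 2 after A, X-1 = 4 after A and B, and X = 5 after all three, matching the description.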
> However, this is a bit more complex in practice because of floating-point arithmetic: intermediate score values are doubles, which only get cast to a float after all values have been summed up. In order to keep things simple, BlockMaxConjunctionScorer has the following comment and code:
> {code}
>         // Also compute the minimum required scores for a hit to be competitive
>         // A double that is less than 'score' might still be converted to 'score'
>         // when casted to a float, so we go to the previous float to avoid this issue
>         minScores[minScores.length - 1] = minScore > 0 ? Math.nextDown(minScore) : 0;
> {code}
> It simplifies the problem by calling Math.nextDown(minScore). However, this is problematic because it undoes the work that TopScoreDocCollector does when it calls setMinCompetitiveScore with the float value that is immediately greater than the score of the k-th greatest hit so far.
> nextDown(minScore) is not the value that we need. The value that we need is the smallest double that converts to minScore when cast to a float, which lies about half-way between nextDown(minScore) and minScore. In some cases using it would improve the performance of conjunctions, especially if some clauses produce constant scores.
> MaxScoreSumPropagator#setMinCompetitiveScore has the same issue.
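
To make the "smallest double that converts to minScore" idea concrete, here is a minimal sketch in Java. It is not the attached patch: the class and method names are hypothetical, and it assumes a positive, finite float. It relies on two facts: the midpoint between two adjacent floats is exactly representable as a double, and Java's double-to-float cast rounds to nearest with ties going to the float whose mantissa is even. The midpoint therefore converts to minScore only when minScore has an even mantissa; otherwise the next double above the midpoint is the answer.

```java
public class SmallestDoubleForFloat {
    /**
     * Hypothetical helper, not Lucene code: returns the smallest double d such that
     * (float) d == f, for a positive finite float f (normal or denormal).
     */
    public static double smallestDoubleCastingTo(float f) {
        if (!(f > 0) || Float.isInfinite(f)) {
            throw new IllegalArgumentException("expected a positive finite float, got " + f);
        }
        // Adjacent floats and their midpoint are all exactly representable as doubles.
        double mid = ((double) Math.nextDown(f) + (double) f) / 2;
        // The midpoint is a tie; the cast resolves ties towards the even mantissa.
        return (float) mid == f ? mid : Math.nextUp(mid);
    }

    public static void main(String[] args) {
        float minScore = 1.0f;
        double simplified = Math.nextDown(minScore);          // threshold used by the quoted code
        double tighter = smallestDoubleCastingTo(minScore);   // threshold described in the issue
        System.out.println("nextDown(minScore)      = " + simplified);
        System.out.println("smallest double -> float = " + tighter);
        System.out.println("casts back to minScore: " + ((float) tighter == minScore));
    }
}
```

Any double below the returned value casts to a float smaller than minScore, so using it as the threshold rejects strictly more non-competitive sums than nextDown(minScore) does. The same helper works for denormal inputs such as Float.MIN_VALUE, which is the case the comment above asks the test to cover.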
