[jira] [Created] (LUCENE-8824) TestTopDocsMerge Is Broken


Michael Gibney (Jira)
Atri Sharma created LUCENE-8824:

             Summary: TestTopDocsMerge Is Broken
                 Key: LUCENE-8824
                 URL: https://issues.apache.org/jira/browse/LUCENE-8824
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Atri Sharma

While investigating a test failure after LUCENE-8757, I realized that TestTopDocsMerge relies on a non-obvious invariant: that the number of Collectors involved in the merge equals the number of LeafReaderContexts originally present. This assumption is propagated through the shardIndex field of the corresponding ScoreDocs, which can lead to subtle issues because shardIndex is used for tie-breaking in the priority queue used during the merge. I believe this is a dangerous and unnecessary dependency, since the IndexSearcher#slices method does not advertise any such guarantee.
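To illustrate why the shardIndex numbering matters, here is a minimal plain-Java sketch. It is not Lucene code: Hit, MergeTieBreakSketch and the exact tie-break order (score, then shardIndex, then doc ID) are assumptions made for the example. The same three hits are merged twice, once with shardIndex numbered per leaf and once numbered per slice, where slice 0 is assumed to hold leaves 0 and 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a scored hit; this is not Lucene's ScoreDoc class.
final class Hit {
    final int shardIndex;
    final int doc;
    final float score;

    Hit(int shardIndex, int doc, float score) {
        this.shardIndex = shardIndex;
        this.doc = doc;
        this.score = score;
    }
}

final class MergeTieBreakSketch {
    // Assumed merge order: higher score first; ties go to the lower
    // shardIndex, then to the lower doc ID.
    static final Comparator<Hit> MERGE_ORDER =
        Comparator.comparingDouble((Hit h) -> -h.score)
            .thenComparingInt(h -> h.shardIndex)
            .thenComparingInt(h -> h.doc);

    // Sort the hits in merge order and return their doc IDs.
    static List<Integer> mergedDocIds(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(MERGE_ORDER);
        List<Integer> docIds = new ArrayList<>();
        for (Hit h : sorted) {
            docIds.add(h.doc);
        }
        return docIds;
    }

    // shardIndex numbered per leaf: leaf 0 -> 0, leaf 1 -> 1, leaf 2 -> 2.
    static List<Integer> perLeafOrder() {
        return mergedDocIds(Arrays.asList(
            new Hit(0, 0, 2.0f), new Hit(1, 10, 1.0f), new Hit(2, 20, 1.0f)));
    }

    // shardIndex numbered per slice: slice 0 = {leaf 0, leaf 2}, slice 1 = {leaf 1}.
    static List<Integer> perSliceOrder() {
        return mergedDocIds(Arrays.asList(
            new Hit(0, 0, 2.0f), new Hit(0, 20, 1.0f), new Hit(1, 10, 1.0f)));
    }
}
```

In this sketch the tied hits (docs 10 and 20) come out in opposite orders under the two numbering schemes, which is the kind of mismatch a test comparing the two results would report as a failure.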


The underlying assumption held in the past because the default slice allocation algorithm always allocated one slice per segment, guaranteeing that the number of Collectors (== number of slices) equals the number of leaf contexts. With LUCENE-8757, this is no longer true.
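The difference between the two allocation strategies can be sketched in plain Java. This is a simplified illustration, not the algorithm from LUCENE-8757: SliceCountSketch, the greedy packing rule and the maxDocsPerSlice parameter are all assumptions made for the example. Leaves are represented only by their index and document count.

```java
import java.util.ArrayList;
import java.util.List;

final class SliceCountSketch {
    // One slice per leaf: the slice count always equals the leaf count.
    static List<List<Integer>> slicePerLeaf(int[] leafDocCounts) {
        List<List<Integer>> slices = new ArrayList<>();
        for (int leaf = 0; leaf < leafDocCounts.length; leaf++) {
            List<Integer> slice = new ArrayList<>();
            slice.add(leaf);
            slices.add(slice);
        }
        return slices;
    }

    // Hypothetical greedy grouping: keep adding consecutive leaves to the
    // current slice until the next leaf would exceed the doc budget.
    static List<List<Integer>> groupedSlices(int[] leafDocCounts, int maxDocsPerSlice) {
        List<List<Integer>> slices = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long docsInCurrent = 0;
        for (int leaf = 0; leaf < leafDocCounts.length; leaf++) {
            if (!current.isEmpty() && docsInCurrent + leafDocCounts[leaf] > maxDocsPerSlice) {
                slices.add(current);
                current = new ArrayList<>();
                docsInCurrent = 0;
            }
            current.add(leaf);
            docsInCurrent += leafDocCounts[leaf];
        }
        if (!current.isEmpty()) {
            slices.add(current);
        }
        return slices;
    }
}
```

For leaves with 100, 50, 30, 200 and 10 documents and a budget of 150, the grouped variant yields four slices for five leaves, so any code that equates the number of Collectors with the number of leaves no longer holds.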


I propose a rewrite of the test in which ShardSearcher takes a LeafSlice instance and internally performs a sequential search over the leaf contexts of the passed-in slice. This will allow TestTopDocsMerge to create N sub-searchers, where N is the number of slices used by the IndexSearcher being compared against.
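The shape of the proposal can be sketched without Lucene as follows. Everything here is hypothetical and only models the idea: SliceSearcherSketch and its Leaf class stand in for ShardSearcher and LeafReaderContext, and a hit is any document whose score is at least minScore. Each sub-searcher walks the leaves of its slice one after another and tags every hit with the slice's index, so the number of distinct shardIndex values matches the number of slices.

```java
import java.util.ArrayList;
import java.util.List;

final class SliceSearcherSketch {
    // Hypothetical leaf model: a docBase offset plus one score per local doc.
    static final class Leaf {
        final int docBase;
        final float[] scores;

        Leaf(int docBase, float[] scores) {
            this.docBase = docBase;
            this.scores = scores;
        }
    }

    // Search one slice sequentially, leaf by leaf. Each returned entry is
    // {shardIndex, globalDocId}, where shardIndex is the slice's index.
    static List<int[]> searchSlice(int sliceIndex, List<Leaf> slice, float minScore) {
        List<int[]> hits = new ArrayList<>();
        for (Leaf leaf : slice) {
            for (int i = 0; i < leaf.scores.length; i++) {
                if (leaf.scores[i] >= minScore) {
                    hits.add(new int[] {sliceIndex, leaf.docBase + i});
                }
            }
        }
        return hits;
    }

    // One sub-searcher per slice, so N slices produce N shardIndex values.
    static List<int[]> searchAll(List<List<Leaf>> slices, float minScore) {
        List<int[]> hits = new ArrayList<>();
        for (int s = 0; s < slices.size(); s++) {
            hits.addAll(searchSlice(s, slices.get(s), minScore));
        }
        return hits;
    }
}
```

With this arrangement the test's sub-searchers and the IndexSearcher under comparison would number their shards the same way, regardless of how many leaves each slice contains.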
