[GitHub] [lucene-solr] atris commented on issue #854: Shared PQ Based Early Termination for Concurrent Search

GitBox
atris commented on issue #854: Shared PQ Based Early Termination for Concurrent Search
URL: https://github.com/apache/lucene-solr/pull/854#issuecomment-530509226
 
 
   > I think we're talking about different approaches, hence the confusion. It is correct that we can start setting the minimum score when the global count of documents that we collected reaches the requested size, but if the local pqs are not full you can only use the minimum minimum score.
   > So the bottom score of the minimum scores.
   > Requiring a queue to be filled completely before publishing a minimum score allows us to use the maximum minimum score among the slices that have a full pq. We can mix the two approaches, switching from the minimum minimum to the maximum minimum when pqs are filled, but I wonder if this is really needed since topN is a small value? Said differently, I wonder if checking the global minimum score before a single pq is filled is a premature optimization?
   
   I see your point. So, what you are proposing is that we basically allow only the PQs that are full to publish the minimum score, and if there are multiple full PQs publishing, take the maximum of all bottom scores as the global minimum?
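
To make the rule concrete, here is a minimal sketch of it, not code from the patch. The class and method names (`GlobalMinScore`, `globalMinCompetitiveScore`) and the array-based inputs are hypothetical and are not Lucene APIs. It assumes each slice reports its current queue size and bottom score. A full queue already holds topN hits scoring at least its bottom, so anything scoring below that bottom cannot reach the global top N, and the largest such bottom is the tightest safe bound.

```java
import java.util.OptionalDouble;

final class GlobalMinScore {
    /**
     * Hypothetical helper: returns the maximum bottom score among slices
     * whose priority queue is full, or empty if no queue is full yet.
     */
    static OptionalDouble globalMinCompetitiveScore(
            float[] bottomScores, int[] queueSizes, int topN) {
        double best = Double.NEGATIVE_INFINITY;
        boolean anyFull = false;
        for (int i = 0; i < bottomScores.length; i++) {
            // Only a full queue is allowed to publish its bottom score.
            if (queueSizes[i] >= topN) {
                best = Math.max(best, bottomScores[i]);
                anyFull = true;
            }
        }
        return anyFull ? OptionalDouble.of(best) : OptionalDouble.empty();
    }
}
```

With topN = 10, bottom scores {1.5, 2.5, 9.0} and queue sizes {10, 10, 3}, the third slice is ignored because its queue is not full, and the global minimum is 2.5.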
   
   Sounds like a fair approach -- I can post a patch to start with.
   
   My only concern (as you rightly captured) is the case where N in top N is large. In that scenario, this approach might be suboptimal, since a slice publishes nothing until its PQ is full. Nevertheless, it will satisfy a large number of use cases.
   
   RE: synchronization, what are your thoughts on using a global shared array where each collector publishes its bottom value vs message passing?
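
For discussion, the shared-array option could look roughly like the sketch below. Everything in it is an assumption for illustration: `SharedBottomScores`, `publish`, and `globalMin` are hypothetical names, not existing Lucene classes, and it assumes one slot per slice that is written only by that slice's collector once its PQ is full. Floats are stored as int bits in an `AtomicIntegerArray` so that writes are visible across threads without locking.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

final class SharedBottomScores {
    // Slots start at negative infinity, meaning "nothing published yet".
    private static final int UNSET = Float.floatToIntBits(Float.NEGATIVE_INFINITY);
    private final AtomicIntegerArray bits;

    SharedBottomScores(int numSlices) {
        bits = new AtomicIntegerArray(numSlices);
        for (int i = 0; i < numSlices; i++) {
            bits.set(i, UNSET);
        }
    }

    /** Called by the collector that owns this slice once its PQ is full. */
    void publish(int slice, float bottomScore) {
        // Single writer per slot, so a plain volatile-style set is enough.
        bits.set(slice, Float.floatToIntBits(bottomScore));
    }

    /** Maximum of all published bottom scores; negative infinity if none. */
    float globalMin() {
        float max = Float.NEGATIVE_INFINITY;
        for (int i = 0; i < bits.length(); i++) {
            max = Math.max(max, Float.intBitsToFloat(bits.get(i)));
        }
        return max;
    }
}
```

The trade-off in this sketch is that every read scans all slots, which is cheap when the number of slices is small; message passing would avoid the scan but needs a queue and a consumer.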

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[hidden email]


With regards,
Apache Git Services
