[GitHub] [lucene-solr] jimczi commented on issue #854: Shared PQ Based Early Termination for Concurrent Search

GitBox
jimczi commented on issue #854: Shared PQ Based Early Termination for Concurrent Search
URL: https://github.com/apache/lucene-solr/pull/854#issuecomment-530398602
 
 
   > Thanks for offering. I have a skeletal PR in flight for this approach that I plan to publish tomorrow -- maybe we can iterate on that?
   
   Sure, thanks
   
   > I am sure I am missing something here. If the user requested top N hits, then all slices can keep collecting hits in their thread local PQs and update a global counter to reflect if total hits collected globally has reached N. Once we have reached N globally, each collector can publish the value of the bottom of their thread local PQ. The minimum of all such values will be our global minimum score, since we know that, collectively, we have N hits available. Post that, all collectors will use the global minimum score to filter hits. If, a collector finds a competitive hit, it adds it to the local queue, updates its local minimum score and triggers a resync, where the minimum of all minimum scores (if that makes sense) is taken and kept as the global worst hit.
   
   I think it would be simpler to keep, on each slice, the maximum of the published minimum scores. Each time a slice publishes a new minimum score, we can notify a listener on every other top docs collector, which then raises its local minimum score if needed. Synchronization shouldn't be the bottleneck here, but I'm happy to be proven wrong.
   The global total-hits counter must reach its threshold before any minimum score is published, but the publisher must also ensure that its local PQ is full before publishing: the total hits threshold can be reached while none of the local PQs is completely filled, and publishing at that point would break the contract.
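
A minimal sketch of the idea above, written against plain JDK classes rather than the Lucene API. The names `SharedSearchState`, `SliceCollector`, `maxMinScore` and `totalHitsThreshold` are hypothetical and used only for illustration. As a simplification, each collector reads a shared accumulator on every hit instead of being notified through a listener, and ties on the global bound are kept rather than skipped.

```java
import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.DoubleAccumulator;

/** Hypothetical state shared by all slices; not a Lucene class. */
final class SharedSearchState {
    final int totalHitsThreshold;
    final AtomicLong totalHits = new AtomicLong();
    // Holds the maximum of all per-slice minimum scores published so far.
    final DoubleAccumulator maxMinScore =
        new DoubleAccumulator(Double::max, Double.NEGATIVE_INFINITY);

    SharedSearchState(int totalHitsThreshold) {
        this.totalHitsThreshold = totalHitsThreshold;
    }
}

/** Hypothetical per-slice collector; each instance is used by one thread. */
final class SliceCollector {
    private final int topN;
    private final SharedSearchState shared;
    // Min-heap: the head is the worst score currently kept by this slice.
    private final PriorityQueue<Float> pq = new PriorityQueue<>();

    SliceCollector(int topN, SharedSearchState shared) {
        this.topN = topN;
        this.shared = shared;
    }

    /** Returns true if the hit was kept in the local queue. */
    boolean collect(float score) {
        long seen = shared.totalHits.incrementAndGet();
        // Some slice already holds topN hits at or above the shared bound,
        // so a strictly lower score cannot be in the global top N.
        if (score < shared.maxMinScore.get()) {
            return false;
        }
        if (pq.size() == topN) {
            if (score <= pq.peek()) {
                return false; // not competitive locally
            }
            pq.poll(); // evict the local worst hit
        }
        pq.add(score);
        // Publish only when BOTH conditions hold: the global hit count
        // reached the threshold and the local queue is full.
        if (pq.size() == topN && seen >= shared.totalHitsThreshold) {
            shared.maxMinScore.accumulate(pq.peek());
        }
        return true;
    }
}
```

Because the accumulator only ever keeps the maximum, a slice that publishes a lower bottom score than the current bound leaves the bound unchanged, which matches the "maximum minimum score" behaviour described above.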
   
   
