[jira] [Created] (LUCENE-8319) A Time-limiting collector that works with CollectorManagers


JIRA jira@apache.org
Tony Xu created LUCENE-8319:
-------------------------------

             Summary: A Time-limiting collector that works with CollectorManagers
                 Key: LUCENE-8319
                 URL: https://issues.apache.org/jira/browse/LUCENE-8319
             Project: Lucene - Core
          Issue Type: Improvement
          Components: core/search
            Reporter: Tony Xu


Lucene currently provides *TimeLimitingCollector* for time-bound collection; it throws *TimeExceededException* when the timeout elapses. This only works well with the single-threaded, low-level API of IndexSearcher, whose method signature is:

*void search(List<LeafReaderContext> leaves, Weight weight, Collector collector)*

The intended use is to always wrap the searcher.search(query, collector) call in a try ... catch block and handle the timeout exception there. Unfortunately, when a *CollectorManager* is used in a multi-threaded search, a *TimeExceededException* thrown while collecting one leaf slice is re-thrown by *IndexSearcher* without calling the *CollectorManager*'s reduce(), even if the other slices were collected successfully. The signature of the search API with *CollectorManager* is:

*<C extends Collector, T> T search(Query query, CollectorManager<C, T> collectorManager)*
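
To illustrate the failure mode described above, here is a simplified model written against the JDK only. It is not Lucene's implementation: SliceSearchModel, TimeoutStandIn and searchAndReduce are hypothetical names, and the "reduce" step is just a sum of per-slice hit counts. It shows how one failing slice discards the results of every other slice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Simplified model of per-slice search; none of these types are Lucene classes. */
class SliceSearchModel {

    /** Stand-in for a timeout exception thrown while collecting one slice. */
    static class TimeoutStandIn extends RuntimeException {
        TimeoutStandIn(String message) {
            super(message);
        }
    }

    /**
     * Runs one task per slice, then "reduces" by summing the hit counts.
     * If any slice task throws, the failure propagates and no sum is returned.
     */
    static int searchAndReduce(List<Callable<Integer>> sliceTasks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, sliceTasks.size()));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Callable<Integer> task : sliceTasks) {
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                try {
                    total += future.get();
                } catch (ExecutionException e) {
                    // Re-throw the slice failure; hits from the other slices are lost.
                    throw new RuntimeException(e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(e);
                }
            }
            return total; // the "reduce" step, reached only if every slice succeeded
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In this model a caller that catches the exception has no access to the counts already produced by the successful slices, which mirrors the problem reported for the *CollectorManager* search path.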
 
The good news is that IndexSearcher handles *CollectionTerminatedException* gracefully by ignoring it. We can either wrap TimeLimitingCollector and throw *CollectionTerminatedException* when the timeout happens, or simply replace *TimeExceededException* with *CollectionTerminatedException*. Either way, we also need to maintain a flag that indicates whether a timeout occurred, so that the user knows the collection is partial.
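
A minimal sketch of the wrapping idea, again using JDK-only stand-in types rather than the Lucene API. TimeoutFlagSketch, TerminatedStandIn, DocCollector, TimeLimitedCollector and collectSlice are all hypothetical names: TerminatedStandIn plays the role of *CollectionTerminatedException*, and collectSlice models a searcher loop that ignores it. The shared AtomicBoolean is the "partial collection" flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongSupplier;

/** Sketch of the proposed approach; none of these types are Lucene classes. */
class TimeoutFlagSketch {

    /** Stand-in for an exception that the searcher treats as a normal early exit. */
    static class TerminatedStandIn extends RuntimeException {}

    /** Stand-in for a per-document collection callback. */
    interface DocCollector {
        void collect(int doc);
    }

    /** Wraps a collector, stops collecting once the deadline passes, and records that it did. */
    static class TimeLimitedCollector implements DocCollector {
        private final DocCollector delegate;
        private final LongSupplier clock;     // injectable so the sketch can be tested
        private final long deadline;          // same unit as the clock
        private final AtomicBoolean timedOut; // shared by the collectors of all slices

        TimeLimitedCollector(DocCollector delegate, LongSupplier clock,
                             long deadline, AtomicBoolean timedOut) {
            this.delegate = delegate;
            this.clock = clock;
            this.deadline = deadline;
            this.timedOut = timedOut;
        }

        @Override
        public void collect(int doc) {
            if (clock.getAsLong() - deadline >= 0) {
                timedOut.set(true);            // remember that the result is partial
                throw new TerminatedStandIn(); // graceful stop instead of a hard failure
            }
            delegate.collect(doc);
        }
    }

    /** Models a searcher loop that ignores TerminatedStandIn and keeps what was collected. */
    static void collectSlice(int[] docs, DocCollector collector) {
        try {
            for (int doc : docs) {
                collector.collect(doc);
            }
        } catch (TerminatedStandIn e) {
            // Ignored on purpose: this slice ends early and the reduce step can still run.
        }
    }
}
```

After all slices finish, the caller would check the shared flag to decide whether to report the merged result as partial.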



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: [jira] [Created] (LUCENE-8319) A Time-limiting collector that works with CollectorManagers

Michael Sokolov-4
Would it make sense to change TimeExceededException so it extends CollectionTerminatedException?

On Wed, May 16, 2018 at 4:29 PM, Tony Xu (JIRA) <[hidden email]> wrote: