Query on searchAfter API usage in IndexSearcher

tomanishgupta18@gmail.com
 Hi Team,

I am new to Lucene and I am trying to use Lucene for text search in my
project to achieve better results in terms of query performance.

Initially I was facing a lot of GC issues while using Lucene, because I was
calling the search API and passing the total document count as the number
of hits to return. As my data size is around 4 billion, the number of
documents created by Lucene was huge. Internally, the search API uses
TopScoreDocCollector, which creates a PriorityQueue sized to the given
document count, thus causing a lot of GC.

*To avoid this problem I am trying to query in a paginated way: I query
only 10 documents at a time, and after that I use the searchAfter API to
query further, passing the last ScoreDoc from the previous result. This has
resolved the GC problem, but the query time has increased by a huge margin,
from 3 sec to 600 sec.*

*When I debugged, I found that even though I use the searchAfter API, it
does not avoid the IO, and every time it reads the data from disk again.
It only skips the results returned by the previous search. Is my
understanding correct? If yes, please let me know if there is a better way
to query the results incrementally, so as to avoid GC with minimal impact
on query performance.*

Regards
Manish Gupta
Re: Query on searchAfter API usage in IndexSearcher

tomanishgupta18@gmail.com
Hi Lucene Team,

Can you please reply to my query? It's an urgent issue and we need to resolve
it at the earliest.

The Lucene version used is 6.3.0, but I also tried the latest version, 7.3.0.

Regards
Manish Gupta



--
Sent from: http://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Query on searchAfter API usage in IndexSearcher

Jacky Li
In reply to this post by tomanishgupta18@gmail.com
I have encountered the same problem. Does anyone know the solution?

Regards,
Jacky




Re: Query on searchAfter API usage in IndexSearcher

Bryan Bende
Are you specifying a sort clause on your query?

I'm not totally sure, but I think having a sort clause might be a
requirement for efficient deep paging.

I know Solr's cursorMark feature uses the searchAfter API, and a
cursorMark is essentially the sort values of the last document from
the previous result:

https://github.com/apache/lucene-solr/blob/e30264b31400a147507aabd121b1152020b8aa6d/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1524-L1525
https://lucene.apache.org/solr/guide/7_3/pagination-of-results.html
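
As a rough illustration of the cursor idea described above, here is a small,
Lucene-free sketch in which the cursor is nothing more than the sort values
of the last document on the previous page. All names (Doc, Cursor, nextPage)
and the timestamp/id fields are hypothetical and only model the concept. In
Lucene itself the corresponding call would be the overload
IndexSearcher.searchAfter(ScoreDoc after, Query query, int n, Sort sort),
where "after" is the FieldDoc returned by the previous sorted search. Solr's
cursorMark additionally requires the sort to include the uniqueKey field as
a tie-breaker, which the id field stands in for here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortCursorModel {
    /** Hypothetical document: one sort field plus a unique id as tie-breaker. */
    static final class Doc {
        final long timestamp;
        final String id;
        Doc(long timestamp, String id) { this.timestamp = timestamp; this.id = id; }
    }

    /** The cursor holds only the sort values of the last document returned. */
    static final class Cursor {
        final long timestamp;
        final String id;
        Cursor(Doc last) { this.timestamp = last.timestamp; this.id = last.id; }
    }

    /** Sort order: timestamp ascending, then id ascending to break ties. */
    static final Comparator<Doc> SORT =
        Comparator.comparingLong((Doc d) -> d.timestamp).thenComparing(d -> d.id);

    /** True if the document sorts strictly after the cursor position. */
    static boolean isAfter(Doc d, Cursor c) {
        if (d.timestamp != c.timestamp) return d.timestamp > c.timestamp;
        return d.id.compareTo(c.id) > 0;
    }

    /** Returns up to pageSize documents that sort strictly after the cursor. */
    static List<Doc> nextPage(List<Doc> docs, Cursor cursor, int pageSize) {
        List<Doc> candidates = new ArrayList<>();
        for (Doc d : docs) {
            if (cursor == null || isAfter(d, cursor)) candidates.add(d);
        }
        candidates.sort(SORT);
        int end = Math.min(pageSize, candidates.size());
        return new ArrayList<>(candidates.subList(0, end));
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc(2L, "d"));
        docs.add(new Doc(1L, "b"));
        docs.add(new Doc(3L, "e"));
        docs.add(new Doc(1L, "a"));
        docs.add(new Doc(2L, "c"));

        Cursor cursor = null;
        List<Doc> page;
        while (!(page = nextPage(docs, cursor, 2)).isEmpty()) {
            for (Doc d : page) System.out.println(d.timestamp + " " + d.id);
            cursor = new Cursor(page.get(page.size() - 1));
        }
    }
}
```

Without the unique id in the sort, two documents with the same timestamp
would be indistinguishable to the cursor, and one of them could be skipped
or repeated at a page boundary.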


On Wed, May 9, 2018 at 4:56 AM, Jacky Li <[hidden email]> wrote:

> I have encountered the same problem, I wonder if anyone know the solution?
>
> Regards,
> Jacky