Remote Searcher performance and Document retrieval

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Remote Searcher performance and Document retrieval

sashaman
I have a question about the performance of the following Lucene setup:

I have a MultiSearcher that contains 3 RemoteSearchables that point to IndexSearchables on 3 different boxes, and each index contains about 20,000 docs. When an end user runs a search, results are displayed with 20 hits per page, and each hits shows about 10 fields from the document. My issue is that the remote protocol used to retrieve the Hits appears to be very chatty, requiring one remote network call for each Hit retrieved (that is, each call to Hits.doc(n) ends up making an individual remote call). Thus, to display a result page to an end user, there are 3 remote search() calls (one for each index) and 20 remote doc() calls. When I added some performance monitoring logging to my code, I found that for each results page retrieving the Documents took more than twice as long as the actual search() call.

It seems like things would be much more performant if instead there were a method like "Document[] Hits.docs(int[] docIds)" (and a corresponding method on the Searcher "Document[] Searcher.docs(int[] docIds)") that could return a bulk set of Documents at once. This way I would only need to make one remote call to each index for each page of results.

Can anyone comment on this performance issue? Am I doing something wrong, or there is some other way I can retrieve Documents faster when using a Remote index?

Thanks,
Alex
Reply | Threaded
Open this post in threaded view
|

Re: Remote Searcher performance and Document retrieval

Daniel Naber-5
On Monday 08 January 2007 23:08, sashaman wrote:

> Can anyone comment on this performance issue?

Have you compared to a local index? It's not uncommon for several doc()
calls to take more time than searching, as doc() requires a lot I/O, even
locally.

Regards
 Daniel

--
http://www.danielnaber.de

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Remote Searcher performance and Document retrieval

sashaman
No, I haven't compared it to a local index, but when working on other projects in the past I've always seen that "chatty" remote access can really hurt performance, and that batching up multiple calls into a single remote access saves a lot of time.

Alex