Efficiently paginating results.

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Efficiently paginating results.

Jean Sini-2
Hi,

Our application presents search results in a paginated form.

We were unable to find Searcher methods that would return, say, 'n'
(typically, 10) hits after a start offset 'k'.

So we're currently using the Hits collection returned by Searcher.search,
and using its Hits.doc(i) method to get the ith hit, with i between k and
k+n. Is that the most efficient way to do that? Is there a better way (e.g.
some form of Filter on the query itself)?

Thanks,

Jean

 

 

 

Reply | Threaded
Open this post in threaded view
|

Re: Efficiently paginating results.

Karl Wettin-3

27 apr 2006 kl. 20.44 skrev Jean Sini:

> Our application presents search results in a paginated form.
>
> We were unable to find Searcher methods that would return, say, 'n'
> (typically, 10) hits after a start offset 'k'.
>
> So we're currently using the Hits collection returned by  
> Searcher.search,
> and using its Hits.doc(i) method to get the ith hit, with i between  
> k and
> k+n. Is that the most efficient way to do that? Is there a better  
> way (e.g.
> some form of Filter on the query itself)?

You probably want to do it just the way you do.

But cache the Hits somehow. Perhaps in a session, perhaps globally  
in  /your/ searcher. Perhaps the session points at the global cache  
so it doesn't change within a session when you flush the cache on  
index update.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Efficiently paginating results.

Yonik Seeley
In reply to this post by Jean Sini-2
On 4/27/06, Jean Sini <[hidden email]> wrote:
> We were unable to find Searcher methods that would return, say, 'n'
> (typically, 10) hits after a start offset 'k'.

Yes, that's because to find results k through k+n, Lucene must first
find results 0 through k+n.

> So we're currently using the Hits collection returned by Searcher.search,
> and using its Hits.doc(i) method to get the ith hit, with i between k and
> k+n. Is that the most efficient way to do that?

You can use the lower level search functions that return TopDocs or
TopFieldDocs to get more explicit control over "n", when you search
for the top "n" documents.

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: Efficiently paginating results.

Jean Sini-2
In reply to this post by Karl Wettin-3
Thanks.
One of the trade-offs we are considering, along the lines of what you
mentioned, has to do with whether or not to cache the Hits. The benefit
being that we'd avoid re-running the search if requests for hits past the
first page do come in, the cost being that we'd have to keep around all the
hits, that are unlikely to be requested.
So far, we have been leaning toward a stateless approach: we are re-running
the search with no prior knowledge of whether or not it ran before, and
serving up the hits at the requested offset. I could see caching a fixed
list of search results, and only re-running the search if it has been
evicted from that cache.
Jean

-----Original Message-----
From: karl wettin [mailto:[hidden email]]
Sent: Thursday, April 27, 2006 12:09 PM
To: [hidden email]
Subject: Re: Efficiently paginating results.


27 apr 2006 kl. 20.44 skrev Jean Sini:

> Our application presents search results in a paginated form.
>
> We were unable to find Searcher methods that would return, say, 'n'
> (typically, 10) hits after a start offset 'k'.
>
> So we're currently using the Hits collection returned by  
> Searcher.search,
> and using its Hits.doc(i) method to get the ith hit, with i between  
> k and
> k+n. Is that the most efficient way to do that? Is there a better  
> way (e.g.
> some form of Filter on the query itself)?

You probably want to do it just the way you do.

But cache the Hits somehow. Perhaps in a session, perhaps globally  
in  /your/ searcher. Perhaps the session points at the global cache  
so it doesn't change within a session when you flush the cache on  
index update.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Efficiently paginating results.

Karl Wettin-3

27 apr 2006 kl. 23.39 skrev Jean Sini:

>> 27 apr 2006 kl. 20.44 skrev Jean Sini:
>>> Our application presents search results in a paginated form.
>>>
>>> We were unable to find Searcher methods that would return, say, 'n'
>>> (typically, 10) hits after a start offset 'k'.
>>>
>>> So we're currently using the Hits collection returned by
>>> Searcher.search and using its Hits.doc(i) method to get the hit
>>
>> You probably want to do it just the way you do.
>>
>> But cache the Hits somehow.

> One of the trade-offs we are considering, along the lines of what you
> mentioned, has to do with whether or not to cache the Hits. The  
> benefit
> being that we'd avoid re-running the search if requests for hits  
> past the
> first page do come in, the cost being that we'd have to keep around  
> all the
> hits, that are unlikely to be requested.

Have you done any tests to compare? In one case I keep about 20 000  
Hits (10 minutes) in cache until it is flushed. About 60% of the  
results are fetched from the cache. The processor goes from 10% to  
80% idle. It consumes virtually no memory.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Efficiently paginating results.

marc_dauncey
In reply to this post by Karl Wettin-3
I read somewhere recently (maybe even on this list) a
recommendation to requery each time for successive
pages as this avoids some of the complexity involved
in session management. Whats peoples view of this?

Marc


--- karl wettin <[hidden email]> wrote:

>
> 27 apr 2006 kl. 20.44 skrev Jean Sini:
> > Our application presents search results in a
> paginated form.
> >
> > We were unable to find Searcher methods that would
> return, say, 'n'
> > (typically, 10) hits after a start offset 'k'.
> >
> > So we're currently using the Hits collection
> returned by  
> > Searcher.search,
> > and using its Hits.doc(i) method to get the ith
> hit, with i between  
> > k and
> > k+n. Is that the most efficient way to do that? Is
> there a better  
> > way (e.g.
> > some form of Filter on the query itself)?
>
> You probably want to do it just the way you do.
>
> But cache the Hits somehow. Perhaps in a session,
> perhaps globally  
> in  /your/ searcher. Perhaps the session points at
> the global cache  
> so it doesn't change within a session when you flush
> the cache on  
> index update.
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [hidden email]
> For additional commands, e-mail:
> [hidden email]
>
>



               
___________________________________________________________
Yahoo! Photos – NEW, now offering a quality print service from just 7p a photo http://uk.photos.yahoo.com

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Efficiently paginating results.

Hannes Carl Meyer
Hi Marc,

I'm using this method for a web-application. I'm storing only the
current viewable set of documents in the session and re-query if the user
scrolls to the next page. This method is pretty fast and has a minimal
session- and processing-footprint. But, if your index is changed during
scrolling and that affects the query and you're re-querying, you will
maybe get confused about new docs :-)

Hannes

Marc Dauncey schrieb:

> I read somewhere recently (maybe even on this list) a
> recommendation to requery each time for successive
> pages as this avoids some of the complexity involved
> in session management. Whats peoples view of this?
>
> Marc
>
>
> --- karl wettin <[hidden email]> wrote:
>
>  
>> 27 apr 2006 kl. 20.44 skrev Jean Sini:
>>    
>>> Our application presents search results in a
>>>      
>> paginated form.
>>    
>>> We were unable to find Searcher methods that would
>>>      
>> return, say, 'n'
>>    
>>> (typically, 10) hits after a start offset 'k'.
>>>
>>> So we're currently using the Hits collection
>>>      
>> returned by  
>>    
>>> Searcher.search,
>>> and using its Hits.doc(i) method to get the ith
>>>      
>> hit, with i between  
>>    
>>> k and
>>> k+n. Is that the most efficient way to do that? Is
>>>      
>> there a better  
>>    
>>> way (e.g.
>>> some form of Filter on the query itself)?
>>>      
>> You probably want to do it just the way you do.
>>
>> But cache the Hits somehow. Perhaps in a session,
>> perhaps globally  
>> in  /your/ searcher. Perhaps the session points at
>> the global cache  
>> so it doesn't change within a session when you flush
>> the cache on  
>> index update.
>>
>>
>>    
> ---------------------------------------------------------------------
>  
>> To unsubscribe, e-mail:
>> [hidden email]
>> For additional commands, e-mail:
>> [hidden email]
>>
>>
>>    
>
>
>
>
> ___________________________________________________________
> Yahoo! Photos – NEW, now offering a quality print service from just 7p a photo http://uk.photos.yahoo.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>  


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Efficiently paginating results.

marc_dauncey
Yes, I was thinking about index updates.

Getting a different result set when you go back to a
previous page might be an issue -   could always cache
each page as its opened rather than the entire result
set.  



--- Hannes Carl Meyer <[hidden email]> wrote:

> Hi Marc,
>
> I'm using this method for a web-application. I'm
> storing only the
> current viewable set of documents in the session and
> re-query if the user
> scrolls to the next page. This method is pretty fast
> and has a minimal
> session- and processing-footprint. But, if your
> index is changed during
> scrolling and that affects the query and you're
> re-querying, you will
> maybe get confused about new docs :-)
>
> Hannes
>
> Marc Dauncey schrieb:
> > I read somewhere recently (maybe even on this
> list) a
> > recommendation to requery each time for successive
> > pages as this avoids some of the complexity
> involved
> > in session management. Whats peoples view of this?
> >
> > Marc
> >
> >
> > --- karl wettin <[hidden email]> wrote:
> >
> >  
> >> 27 apr 2006 kl. 20.44 skrev Jean Sini:
> >>    
> >>> Our application presents search results in a
> >>>      
> >> paginated form.
> >>    
> >>> We were unable to find Searcher methods that
> would
> >>>      
> >> return, say, 'n'
> >>    
> >>> (typically, 10) hits after a start offset 'k'.
> >>>
> >>> So we're currently using the Hits collection
> >>>      
> >> returned by  
> >>    
> >>> Searcher.search,
> >>> and using its Hits.doc(i) method to get the ith
> >>>      
> >> hit, with i between  
> >>    
> >>> k and
> >>> k+n. Is that the most efficient way to do that?
> Is
> >>>      
> >> there a better  
> >>    
> >>> way (e.g.
> >>> some form of Filter on the query itself)?
> >>>      
> >> You probably want to do it just the way you do.
> >>
> >> But cache the Hits somehow. Perhaps in a session,
> >> perhaps globally  
> >> in  /your/ searcher. Perhaps the session points
> at
> >> the global cache  
> >> so it doesn't change within a session when you
> flush
> >> the cache on  
> >> index update.
> >>
> >>
> >>    
> >
>
---------------------------------------------------------------------

> >  
> >> To unsubscribe, e-mail:
> >> [hidden email]
> >> For additional commands, e-mail:
> >> [hidden email]
> >>
> >>
> >>    
> >
> >
> >
> >
> >
>
___________________________________________________________
>
> > Yahoo! Photos – NEW, now offering a quality print
> service from just 7p a photo
> http://uk.photos.yahoo.com
> >
> >
>
---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> [hidden email]
> > For additional commands, e-mail:
> [hidden email]
> >
> >  
>
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [hidden email]
> For additional commands, e-mail:
> [hidden email]
>
>


Send instant messages to your online friends http://uk.messenger.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Efficiently paginating results.

Hannes Carl Meyer
p.s. To avoid that issue you could store the result-sets document ids in
the session.

Marc Dauncey schrieb:

> Yes, I was thinking about index updates.
>
> Getting a different result set when you go back to a
> previous page might be an issue -   could always cache
> each page as its opened rather than the entire result
> set.  
>
>
>
> --- Hannes Carl Meyer <[hidden email]> wrote:
>
>  
>> Hi Marc,
>>
>> I'm using this method for a web-application. I'm
>> storing only the
>> current viewable set of documents in the session and
>> re-query if the user
>> scrolls to the next page. This method is pretty fast
>> and has a minimal
>> session- and processing-footprint. But, if your
>> index is changed during
>> scrolling and that affects the query and you're
>> re-querying, you will
>> maybe get confused about new docs :-)
>>
>> Hannes
>>
>> Marc Dauncey schrieb:
>>    
>>> I read somewhere recently (maybe even on this
>>>      
>> list) a
>>    
>>> recommendation to requery each time for successive
>>> pages as this avoids some of the complexity
>>>      
>> involved
>>    
>>> in session management. Whats peoples view of this?
>>>
>>> Marc
>>>
>>>
>>> --- karl wettin <[hidden email]> wrote:
>>>
>>>  
>>>      
>>>> 27 apr 2006 kl. 20.44 skrev Jean Sini:
>>>>    
>>>>        
>>>>> Our application presents search results in a
>>>>>      
>>>>>          
>>>> paginated form.
>>>>    
>>>>        
>>>>> We were unable to find Searcher methods that
>>>>>          
>> would
>>    
>>>>>      
>>>>>          
>>>> return, say, 'n'
>>>>    
>>>>        
>>>>> (typically, 10) hits after a start offset 'k'.
>>>>>
>>>>> So we're currently using the Hits collection
>>>>>      
>>>>>          
>>>> returned by  
>>>>    
>>>>        
>>>>> Searcher.search,
>>>>> and using its Hits.doc(i) method to get the ith
>>>>>      
>>>>>          
>>>> hit, with i between  
>>>>    
>>>>        
>>>>> k and
>>>>> k+n. Is that the most efficient way to do that?
>>>>>          
>> Is
>>    
>>>>>      
>>>>>          
>>>> there a better  
>>>>    
>>>>        
>>>>> way (e.g.
>>>>> some form of Filter on the query itself)?
>>>>>      
>>>>>          
>>>> You probably want to do it just the way you do.
>>>>
>>>> But cache the Hits somehow. Perhaps in a session,
>>>> perhaps globally  
>>>> in  /your/ searcher. Perhaps the session points
>>>>        
>> at
>>    
>>>> the global cache  
>>>> so it doesn't change within a session when you
>>>>        
>> flush
>>    
>>>> the cache on  
>>>> index update.
>>>>
>>>>
>>>>    
>>>>        
> ---------------------------------------------------------------------
>  
>>>  
>>>      
>>>> To unsubscribe, e-mail:
>>>> [hidden email]
>>>> For additional commands, e-mail:
>>>> [hidden email]
>>>>
>>>>
>>>>    
>>>>        
>>>
>>>
>>>
>>>      
> ___________________________________________________________
>  
>>> Yahoo! Photos – NEW, now offering a quality print
>>>      
>> service from just 7p a photo
>> http://uk.photos.yahoo.com
>>    
>>>      
> ---------------------------------------------------------------------
>  
>>> To unsubscribe, e-mail:
>>>      
>> [hidden email]
>>    
>>> For additional commands, e-mail:
>>>      
>> [hidden email]
>>    
>>>  
>>>      
>>
>>    
> ---------------------------------------------------------------------
>  
>> To unsubscribe, e-mail:
>> [hidden email]
>> For additional commands, e-mail:
>> [hidden email]
>>
>>
>>    
>
>
> Send instant messages to your online friends http://uk.messenger.yahoo.com 
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>  


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Efficiently paginating results.

Volodymyr Bychkoviak
In reply to this post by marc_dauncey
I'm caching hits by query. When accessing more documents Lucene
automatically re-quering index to retrieve more document.
When index changes then I reopen IndexReader and clear cache.

Marc Dauncey wrote:

> I read somewhere recently (maybe even on this list) a
> recommendation to requery each time for successive
> pages as this avoids some of the complexity involved
> in session management. Whats peoples view of this?
>
> Marc
>
>
> --- karl wettin <[hidden email]> wrote:
>
>  
>> 27 apr 2006 kl. 20.44 skrev Jean Sini:
>>    
>>> Our application presents search results in a
>>>      
>> paginated form.
>>    
>>> We were unable to find Searcher methods that would
>>>      
>> return, say, 'n'
>>    
>>> (typically, 10) hits after a start offset 'k'.
>>>
>>> So we're currently using the Hits collection
>>>      
>> returned by  
>>    
>>> Searcher.search,
>>> and using its Hits.doc(i) method to get the ith
>>>      
>> hit, with i between  
>>    
>>> k and
>>> k+n. Is that the most efficient way to do that? Is
>>>      
>> there a better  
>>    
>>> way (e.g.
>>> some form of Filter on the query itself)?
>>>      
>> You probably want to do it just the way you do.
>>
>> But cache the Hits somehow. Perhaps in a session,
>> perhaps globally  
>> in  /your/ searcher. Perhaps the session points at
>> the global cache  
>> so it doesn't change within a session when you flush
>> the cache on  
>> index update.
>>
>>
>>    
> ---------------------------------------------------------------------
>  
>> To unsubscribe, e-mail:
>> [hidden email]
>> For additional commands, e-mail:
>> [hidden email]
>>
>>
>>    
>
>
>
>
> ___________________________________________________________
> Yahoo! Photos -- NEW, now offering a quality print service from just 7p a photo http://uk.photos.yahoo.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [hidden email]
> For additional commands, e-mail: [hidden email]
>
>
>  

--
regards,
Volodymyr Bychkoviak

Reply | Threaded
Open this post in threaded view
|

RE: Efficiently paginating results.

Kinnar Kumar Sen, Noida
In reply to this post by Jean Sini-2
Hi Marc

Can you give some statistics about the amount of data you are indexing ?
Do you not think requering for pagination will increase the time taken
for bringing the hits. Rather than bringing the entire hits once in the
memory then displaying it as and when the user is clicking on the next
button. It would be very kind of you if you give me a detailed view
about this..

Regards,

Kinnar


-----Original Message-----
From: Marc Dauncey [mailto:[hidden email]]
Sent: Friday, April 28, 2006 8:25 PM
To: [hidden email]
Subject: Re: Efficiently paginating results.

Yes, I was thinking about index updates.

Getting a different result set when you go back to a
previous page might be an issue -   could always cache
each page as its opened rather than the entire result
set.  



--- Hannes Carl Meyer <[hidden email]> wrote:

> Hi Marc,
>
> I'm using this method for a web-application. I'm
> storing only the
> current viewable set of documents in the session and
> re-query if the user
> scrolls to the next page. This method is pretty fast
> and has a minimal
> session- and processing-footprint. But, if your
> index is changed during
> scrolling and that affects the query and you're
> re-querying, you will
> maybe get confused about new docs :-)
>
> Hannes
>
> Marc Dauncey schrieb:
> > I read somewhere recently (maybe even on this
> list) a
> > recommendation to requery each time for successive
> > pages as this avoids some of the complexity
> involved
> > in session management. Whats peoples view of this?
> >
> > Marc
> >
> >
> > --- karl wettin <[hidden email]> wrote:
> >
> >  
> >> 27 apr 2006 kl. 20.44 skrev Jean Sini:
> >>    
> >>> Our application presents search results in a
> >>>      
> >> paginated form.
> >>    
> >>> We were unable to find Searcher methods that
> would
> >>>      
> >> return, say, 'n'
> >>    
> >>> (typically, 10) hits after a start offset 'k'.
> >>>
> >>> So we're currently using the Hits collection
> >>>      
> >> returned by  
> >>    
> >>> Searcher.search,
> >>> and using its Hits.doc(i) method to get the ith
> >>>      
> >> hit, with i between  
> >>    
> >>> k and
> >>> k+n. Is that the most efficient way to do that?
> Is
> >>>      
> >> there a better  
> >>    
> >>> way (e.g.
> >>> some form of Filter on the query itself)?
> >>>      
> >> You probably want to do it just the way you do.
> >>
> >> But cache the Hits somehow. Perhaps in a session,
> >> perhaps globally  
> >> in  /your/ searcher. Perhaps the session points
> at
> >> the global cache  
> >> so it doesn't change within a session when you
> flush
> >> the cache on  
> >> index update.
> >>
> >>
> >>    
> >
>
---------------------------------------------------------------------

> >  
> >> To unsubscribe, e-mail:
> >> [hidden email]
> >> For additional commands, e-mail:
> >> [hidden email]
> >>
> >>
> >>    
> >
> >
> >
> >
> >
>
___________________________________________________________
>
> > Yahoo! Photos - NEW, now offering a quality print
> service from just 7p a photo
> http://uk.photos.yahoo.com
> >
> >
>
---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> [hidden email]
> > For additional commands, e-mail:
> [hidden email]
> >
> >  
>
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [hidden email]
> For additional commands, e-mail:
> [hidden email]
>
>


Send instant messages to your online friends
http://uk.messenger.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: Efficiently paginating results.

marc_dauncey
Hi Kinnar,

Well, I have quite a few indexes, some of which get
updated infrequently with large loads (quartley) and
then some indexes which will have approx 2000
additions a day.

Originally I planned to store the results on the
session - but I have to design for growth, both in
users and in data - I feel that eventually too much
memory will be taken up by this (which can be
mitigated by having multiple search servers) and I'll
end up timing out the session earlier than is useful.

I'm sure I read something in LIA the other day where
Eric recommended requerying, but I don't yet have any
fixed ideas about which way to go. What do you think?

Marc

--- "Kinnar Kumar Sen, Noida" <[hidden email]>
wrote:

> Hi Marc
>
> Can you give some statistics about the amount of
> data you are indexing ?
> Do you not think requering for pagination will
> increase the time taken
> for bringing the hits. Rather than bringing the
> entire hits once in the
> memory then displaying it as and when the user is
> clicking on the next
> button. It would be very kind of you if you give me
> a detailed view
> about this..
>
> Regards,
>
> Kinnar
>
>
> -----Original Message-----
> From: Marc Dauncey [mailto:[hidden email]]
> Sent: Friday, April 28, 2006 8:25 PM
> To: [hidden email]
> Subject: Re: Efficiently paginating results.
>
> Yes, I was thinking about index updates.
>
> Getting a different result set when you go back to a
> previous page might be an issue -   could always
> cache
> each page as its opened rather than the entire
> result
> set.  
>
>
>
> --- Hannes Carl Meyer <[hidden email]> wrote:
>
> > Hi Marc,
> >
> > I'm using this method for a web-application. I'm
> > storing only the
> > current viewable set of documents in the session
> and
> > re-query if the user
> > scrolls to the next page. This method is pretty
> fast
> > and has a minimal
> > session- and processing-footprint. But, if your
> > index is changed during
> > scrolling and that affects the query and you're
> > re-querying, you will
> > maybe get confused about new docs :-)
> >
> > Hannes
> >
> > Marc Dauncey schrieb:
> > > I read somewhere recently (maybe even on this
> > list) a
> > > recommendation to requery each time for
> successive
> > > pages as this avoids some of the complexity
> > involved
> > > in session management. Whats peoples view of
> this?
> > >
> > > Marc
> > >
> > >
> > > --- karl wettin <[hidden email]> wrote:
> > >
> > >  
> > >> 27 apr 2006 kl. 20.44 skrev Jean Sini:
> > >>    
> > >>> Our application presents search results in a
> > >>>      
> > >> paginated form.
> > >>    
> > >>> We were unable to find Searcher methods that
> > would
> > >>>      
> > >> return, say, 'n'
> > >>    
> > >>> (typically, 10) hits after a start offset 'k'.
> > >>>
> > >>> So we're currently using the Hits collection
> > >>>      
> > >> returned by  
> > >>    
> > >>> Searcher.search,
> > >>> and using its Hits.doc(i) method to get the
> ith
> > >>>      
> > >> hit, with i between  
> > >>    
> > >>> k and
> > >>> k+n. Is that the most efficient way to do
> that?
> > Is
> > >>>      
> > >> there a better  
> > >>    
> > >>> way (e.g.
> > >>> some form of Filter on the query itself)?
> > >>>      
> > >> You probably want to do it just the way you do.
> > >>
> > >> But cache the Hits somehow. Perhaps in a
> session,
> > >> perhaps globally  
> > >> in  /your/ searcher. Perhaps the session points
> > at
> > >> the global cache  
> > >> so it doesn't change within a session when you
> > flush
> > >> the cache on  
> > >> index update.
> > >>
> > >>
> > >>    
> > >
> >
>
---------------------------------------------------------------------

> > >  
> > >> To unsubscribe, e-mail:
> > >> [hidden email]
> > >> For additional commands, e-mail:
> > >> [hidden email]
> > >>
> > >>
> > >>    
> > >
> > >
> > >
> > >
> > >
> >
>
___________________________________________________________
> >
> > > Yahoo! Photos - NEW, now offering a quality
> print
> > service from just 7p a photo
> > http://uk.photos.yahoo.com
> > >
> > >
> >
>
---------------------------------------------------------------------

> > > To unsubscribe, e-mail:
> > [hidden email]
> > > For additional commands, e-mail:
> > [hidden email]
> > >
> > >  
> >
> >
> >
>
---------------------------------------------------------------------

> > To unsubscribe, e-mail:
> > [hidden email]
> > For additional commands, e-mail:
> > [hidden email]
> >
> >
>
>
> Send instant messages to your online friends
> http://uk.messenger.yahoo.com 
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [hidden email]
> For additional commands, e-mail:
> [hidden email]
>
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [hidden email]
> For additional commands, e-mail:
> [hidden email]
>
>



               
___________________________________________________________
NEW Yahoo! Cars - sell your car and browse thousands of new and used cars online! http://uk.cars.yahoo.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

RE: Efficiently paginating results.

Kinnar Kumar Sen, Noida
In reply to this post by Jean Sini-2
Hi Marc
   I have basically gone through the book Lucene in Action  where it
suggest requerying would be better, but I believe it depends on the kind
of application you have. In my case I need to rank the hits according to
some other parameters so I need the total hits at a time then rank it
accordingly. Then while paginating displaying the results in batches.
   Can you suggest me some other way to do the same? In my case the
ranking is an absolute must and I can't put the ranking parameters
inside lucene index along with each data.

Regards
Kinnar


-----Original Message-----
From: Marc Dauncey [mailto:[hidden email]]
Sent: Saturday, April 29, 2006 12:08 AM
To: [hidden email]
Subject: RE: Efficiently paginating results.

Hi Kinnar,

Well, I have quite a few indexes, some of which get
updated infrequently with large loads (quartley) and
then some indexes which will have approx 2000
additions a day.

Originally I planned to store the results on the
session - but I have to design for growth, both in
users and in data - I feel that eventually too much
memory will be taken up by this (which can be
mitigated by having multiple search servers) and I'll
end up timing out the session earlier than is useful.

I'm sure I read something in LIA the other day where
Eric recommended requerying, but I don't yet have any
fixed ideas about which way to go. What do you think?

Marc

--- "Kinnar Kumar Sen, Noida" <[hidden email]>
wrote:

> Hi Marc
>
> Can you give some statistics about the amount of
> data you are indexing ?
> Do you not think requering for pagination will
> increase the time taken
> for bringing the hits. Rather than bringing the
> entire hits once in the
> memory then displaying it as and when the user is
> clicking on the next
> button. It would be very kind of you if you give me
> a detailed view
> about this..
>
> Regards,
>
> Kinnar
>
>
> -----Original Message-----
> From: Marc Dauncey [mailto:[hidden email]]
> Sent: Friday, April 28, 2006 8:25 PM
> To: [hidden email]
> Subject: Re: Efficiently paginating results.
>
> Yes, I was thinking about index updates.
>
> Getting a different result set when you go back to a
> previous page might be an issue -   could always
> cache
> each page as its opened rather than the entire
> result
> set.  
>
>
>
> --- Hannes Carl Meyer <[hidden email]> wrote:
>
> > Hi Marc,
> >
> > I'm using this method for a web-application. I'm
> > storing only the
> > current viewable set of documents in the session
> and
> > re-query if the user
> > scrolls to the next page. This method is pretty
> fast
> > and has a minimal
> > session- and processing-footprint. But, if your
> > index is changed during
> > scrolling and that affects the query and you're
> > re-querying, you will
> > maybe get confused about new docs :-)
> >
> > Hannes
> >
> > Marc Dauncey schrieb:
> > > I read somewhere recently (maybe even on this
> > list) a
> > > recommendation to requery each time for
> successive
> > > pages as this avoids some of the complexity
> > involved
> > > in session management. Whats peoples view of
> this?
> > >
> > > Marc
> > >
> > >
> > > --- karl wettin <[hidden email]> wrote:
> > >
> > >  
> > >> 27 apr 2006 kl. 20.44 skrev Jean Sini:
> > >>    
> > >>> Our application presents search results in a
> > >>>      
> > >> paginated form.
> > >>    
> > >>> We were unable to find Searcher methods that
> > would
> > >>>      
> > >> return, say, 'n'
> > >>    
> > >>> (typically, 10) hits after a start offset 'k'.
> > >>>
> > >>> So we're currently using the Hits collection
> > >>>      
> > >> returned by  
> > >>    
> > >>> Searcher.search,
> > >>> and using its Hits.doc(i) method to get the
> ith
> > >>>      
> > >> hit, with i between  
> > >>    
> > >>> k and
> > >>> k+n. Is that the most efficient way to do
> that?
> > Is
> > >>>      
> > >> there a better  
> > >>    
> > >>> way (e.g.
> > >>> some form of Filter on the query itself)?
> > >>>      
> > >> You probably want to do it just the way you do.
> > >>
> > >> But cache the Hits somehow. Perhaps in a
> session,
> > >> perhaps globally  
> > >> in  /your/ searcher. Perhaps the session points
> > at
> > >> the global cache  
> > >> so it doesn't change within a session when you
> > flush
> > >> the cache on  
> > >> index update.
> > >>
> > >>
> > >>    
> > >
> >
>
---------------------------------------------------------------------

> > >  
> > >> To unsubscribe, e-mail:
> > >> [hidden email]
> > >> For additional commands, e-mail:
> > >> [hidden email]
> > >>
> > >>
> > >>    
> > >
> > >
> > >
> > >
> > >
> >
>
___________________________________________________________
> >
> > > Yahoo! Photos - NEW, now offering a quality
> print
> > service from just 7p a photo
> > http://uk.photos.yahoo.com
> > >
> > >
> >
>
---------------------------------------------------------------------

> > > To unsubscribe, e-mail:
> > [hidden email]
> > > For additional commands, e-mail:
> > [hidden email]
> > >
> > >  
> >
> >
> >
>
---------------------------------------------------------------------

> > To unsubscribe, e-mail:
> > [hidden email]
> > For additional commands, e-mail:
> > [hidden email]
> >
> >
>
>
> Send instant messages to your online friends
> http://uk.messenger.yahoo.com 
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [hidden email]
> For additional commands, e-mail:
> [hidden email]
>
>
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [hidden email]
> For additional commands, e-mail:
> [hidden email]
>
>



               
___________________________________________________________
NEW Yahoo! Cars - sell your car and browse thousands of new and used
cars online! http://uk.cars.yahoo.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]