Returning only a small set of results


Returning only a small set of results

Kainth, Sachin
Hi all,

A question about efficiency and the internal workings of the Hits class.
When we make a call to IndexSearcher's search method thus:

Hits hits = searcher.Search(query);

Do we actually, physically get back all the results of the query, even if
there are 20 million of them? Or, for efficiency, do we physically get back
only the first few results, with more fetched as and when we need them
(to traverse, display, or whatever)? It just seems that if we called the
search method and millions of records came back, this would be highly
inefficient. Also, if we do physically get back everything, is there a way
of ensuring we only get back a few at a time?

Thanks

Sachin


This email and any attached files are confidential and copyright protected. If you are not the addressee, any dissemination of this communication is strictly prohibited. Unless otherwise expressly agreed in writing, nothing stated in this communication shall be legally binding.

The ultimate parent company of the Atkins Group is WS Atkins plc.  Registered in England No. 1885586.  Registered Office Woodcote Grove, Ashley Road, Epsom, Surrey KT18 5BW.

Consider the environment. Please don't print this e-mail unless you really need to.

Re: Returning only a small set of results

Chris Hostetter-3
: A question about efficiency and the internal workings of the Hits class.
: When we make a call to IndexSearcher's search method thus:
:
: Hits hits = searcher.Search(query);
:
: Do we actually, physically get back all the results of the query even if
: there are 20 million results or for efficiency do we physically get back

The Hits class fetches the first N result documents (where N is 100, I
think) and then fetches more as needed if you ask for them.
Generally speaking, Hits works fine for simple pagination applications, but
if you intend to walk deep down the list of ordered results, I
would avoid it.


-Hoss
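
As a rough illustration of the fetch-on-demand behaviour described above, here is a minimal sketch in plain Java with no Lucene dependency. It is not Lucene's actual Hits implementation: `LazyHits`, `fakeTopN` and the batch-doubling policy are hypothetical names and assumptions made for the example, and only the initial batch size of 100 comes from the thread. The point it shows is that asking for a hit beyond the cached batch forces the whole query to be run again with a larger N.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical stand-in for a Hits-like wrapper; NOT Lucene's real class.
class LazyHits {
    private final IntFunction<List<Integer>> topN; // re-runs the query, returns the best n doc ids
    private final int total;                       // total number of matching documents
    private List<Integer> cache = new ArrayList<>();
    private int executions = 0;

    LazyHits(IntFunction<List<Integer>> topN, int total, int firstBatch) {
        this.topN = topN;
        this.total = total;
        fetch(firstBatch); // only the first batch is materialised up front
    }

    private void fetch(int n) {
        cache = topN.apply(Math.min(n, total));
        executions++;
    }

    int length() { return total; }

    int doc(int i) {
        if (i < 0 || i >= total) throw new IndexOutOfBoundsException("hit " + i);
        // Assumed growth policy: keep doubling the batch until it covers hit i.
        int n = cache.size();
        while (i >= n) n = Math.max(1, n * 2);
        if (n > cache.size()) fetch(n); // the query is executed again from scratch
        return cache.get(i);
    }

    int executions() { return executions; }
}

class LazyHitsDemo {
    // Pretend query: doc ids 0..n-1 are already in best-score-first order.
    static List<Integer> fakeTopN(int n) {
        List<Integer> ids = new ArrayList<>(n);
        for (int d = 0; d < n; d++) ids.add(d);
        return ids;
    }

    public static void main(String[] args) {
        LazyHits hits = new LazyHits(LazyHitsDemo::fakeTopN, 1_000_000, 100);
        hits.doc(9);      // first page: served from the initial batch
        System.out.println("executions after page 1: " + hits.executions());
        hits.doc(5_000);  // deep access: the query has to be re-run
        System.out.println("executions after deep read: " + hits.executions());
    }
}
```

Under these assumptions, paging through the first screen or two costs a single query execution, while walking deep into the list repeats the search several times. That is the reason for the advice to avoid Hits for deep traversal.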


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


RE: Returning only a small set of results

Kainth, Sachin
What can you use in place of Hits and how do they differ?


Re: Returning only a small set of results

Erick Erickson
See TopDocs, HitCollector, etc. You'll have to dig through the documentation
and try a few experiments to make sense of it all; one-sentence explanations
aren't much help.

But think of Hits as a convenience class for getting the best-scoring 100
documents, and use the other classes if you want to get *all* the documents.
Don't go to the other classes unless you start getting performance problems
with Hits. The main takeaway about Hits is that it will re-execute the query
roughly every 100 documents you read from it, so the only time you care is
when you find yourself assembling large numbers of documents.

Erick
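
To make the two alternatives a little more concrete, the sketch below shows the general ideas behind them in plain Java, without the Lucene library. It is only an illustration: `Collector`, `TopNCollector`, `ScoredDoc` and `CollectorDemo.search` are hypothetical names invented here, not Lucene's TopDocs or HitCollector APIs, whose exact signatures should be checked in the Javadoc for your version. The first part keeps just the N best-scoring hits in a bounded heap in a single pass; the second visits every match through a callback without sorting or caching anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical callback, loosely modelled on the idea of a hit collector.
interface Collector {
    void collect(int doc, float score);
}

final class ScoredDoc {
    final int doc;
    final float score;
    ScoredDoc(int doc, float score) { this.doc = doc; this.score = score; }
}

// Keeps only the n best-scoring hits seen so far (the "top docs" idea).
class TopNCollector implements Collector {
    private final int n;
    // Min-heap on score: the weakest hit we are keeping sits at the head.
    private final PriorityQueue<ScoredDoc> heap =
        new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
    private int totalHits = 0;

    TopNCollector(int n) { this.n = n; }

    @Override
    public void collect(int doc, float score) {
        totalHits++;
        if (n <= 0) return;
        if (heap.size() < n) {
            heap.add(new ScoredDoc(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll();                       // drop the weakest kept hit
            heap.add(new ScoredDoc(doc, score));
        }
    }

    int totalHits() { return totalHits; }

    // Kept hits, best score first.
    List<ScoredDoc> topDocs() {
        List<ScoredDoc> out = new ArrayList<>(heap);
        out.sort((a, b) -> Float.compare(b.score, a.score));
        return out;
    }
}

class CollectorDemo {
    // Pretend search: hands every matching doc to the collector exactly once.
    static void search(float[] scores, Collector c) {
        for (int doc = 0; doc < scores.length; doc++) c.collect(doc, scores[doc]);
    }

    public static void main(String[] args) {
        float[] scores = {0.2f, 0.9f, 0.5f, 0.7f, 0.1f};

        // Top-N: memory is bounded by n, however many documents match.
        TopNCollector top = new TopNCollector(2);
        search(scores, top);
        for (ScoredDoc d : top.topDocs()) System.out.println(d.doc + " " + d.score);

        // All matches: a callback that only counts, with no sorting at all.
        int[] count = {0};
        search(scores, (doc, score) -> count[0]++);
        System.out.println("matches: " + count[0]);
    }
}
```

Either way the query runs once. The top-N variant answers "give me the best few" with memory proportional to N, and the callback variant suits jobs that need to touch every match.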


RE: Returning only a small set of results

Kainth, Sachin
Thanks, Erick. You've helped a lot, and so has everyone else.
