Will Lucene traverse all segments to search for a 'primary key' term, or will it stop as soon as it finds one?


马可阳
Let's say I have a user info index and user id is the 'primary key'. When I do a term search on userid, will Lucene traverse all segments looking for the 'primary key' term, or will it stop as soon as it finds one?

If it is the latter, is there any plan to make it behave the former way?


马可阳

JD.com [Infrastructure Platform | Middleware | JSF]
=====================================
Service building: jsf.jd.com

Re: will lucene traverse all segments to search a 'primary key'term or will it stop as soon as it get one?

Trejkaz
On Fri, Apr 21, 2017 at 1:09 PM, 马可阳 <[hidden email]> wrote:
> Let’s say I have a user info index and user id is the ‘primary key’. So when I do a userid term search,
> will lucene traverse all segments to search a 'primary key'term or will it stop as soon as it get one?
>
> If it is the latter one, will any plan to make it the former way?

Thoughts:

1) What happens if you limit the result count to 1 *and* turn scoring off?

2) If that didn't work, a custom collector which terminates on finding
a single hit sounds easy enough to write.

TX


Re: will lucene traverse all segments to search a 'primary key'term or will it stop as soon as it get one?

Michael McCandless-2
In reply to this post by 马可阳
Lucene by default will search all segments, because it does not know that
your field is a primary key.

Trejkaz's suggestion to early-terminate should work well.  You could also
write custom code that uses TermsEnum on each segment.

Mike McCandless

http://blog.mikemccandless.com
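
Editor's sketch of the early-termination idea discussed above: probe each segment's term dictionary in turn and stop at the first segment that contains the key. This is a toy model in plain Java, not Lucene code; `PrimaryKeyLookup`, `Segment`, `Hit`, and `findFirst` are hypothetical names invented for illustration. In Lucene the analogous pieces would presumably be the per-segment readers from `IndexReader.leaves()` and a `TermsEnum.seekExact` call on each one, and a real lookup would also have to skip deleted documents, which the sketch ignores.

```java
import java.util.List;
import java.util.TreeMap;

/** Toy model of a segmented index. Illustrative only; this is NOT Lucene's API. */
final class PrimaryKeyLookup {

    /** One segment: a sorted term dictionary mapping each key to a segment-local doc id. */
    static final class Segment {
        final int docBase; // index-wide id of this segment's first document
        final TreeMap<String, Integer> terms = new TreeMap<>();

        Segment(int docBase, String... keys) {
            this.docBase = docBase;
            for (int i = 0; i < keys.length; i++) {
                terms.put(keys[i], i); // local doc id = position within the segment
            }
        }
    }

    /** Lookup result: the index-wide doc id (or -1) and how many segments were probed. */
    static final class Hit {
        final int docId;
        final int segmentsVisited;

        Hit(int docId, int segmentsVisited) {
            this.docId = docId;
            this.segmentsVisited = segmentsVisited;
        }
    }

    /** Probes segments in order and returns as soon as one of them contains the key. */
    static Hit findFirst(List<Segment> segments, String key) {
        int visited = 0;
        for (Segment segment : segments) {
            visited++;
            Integer local = segment.terms.get(key);
            if (local != null) {
                // The key is assumed unique, so later segments need not be checked.
                return new Hit(segment.docBase + local, visited);
            }
        }
        return new Hit(-1, visited); // key absent: every segment was probed
    }
}
```

The saving only exists when the key is found; a lookup for a missing key still probes every segment, which is why measuring the plain search first is worthwhile.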

Re: will lucene traverse all segments to search a 'primary key'term or will it stop as soon as it get one?

Chris Hostetter-3
: Lucene by default will search all segments, because it does not know that
: your field is a primary key.
:
: Trejkaz's suggestion to early-terminate should work well.  You could also
: write custom code that uses TermsEnum on each segment.

Before you go too far down the rabbit hole of writing any custom code,
make sure to do some experiments and actually measure the performance of
a uniqueKey lookup using a simple needsScores=false search ... the way
TermQuery works across each segment is very low cost for the segments
where the Term doesn't exist in any docs at all.


-Hoss
http://www.lucidworks.com/


Re: will lucene traverse all segments to search a 'primary key'term or will it stop as soon as it get one?

Ravikumar Govindarajan
>
> Let’s say I have a user info index and user id is the ‘primary key’. So
> when I do a userid term search, will lucene traverse all segments to search
> a 'primary key'term or will it stop as soon as it get one?


Lucene in general will search all segments for a primary key. But if you
want a little speed-up, you can try the Bloom filtering codec or the Memory
codec, if you can afford the extra memory.

--
Ravi
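
Editor's sketch of why a per-segment Bloom filter can help here: the filter answers "definitely absent" cheaply for most segments that do not hold the key, so the term dictionary is consulted only where the key might exist. This is a toy model in plain Java, not Lucene's Bloom filtering postings format; `BloomGuardedSegment`, its two hash functions, and the 1024-bit filter size are all assumptions made for illustration.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

/** Toy segment whose term dictionary is guarded by a small Bloom filter. NOT Lucene's API. */
final class BloomGuardedSegment {
    private static final int BITS = 1 << 10; // filter size, chosen arbitrarily for the sketch

    private final BitSet filter = new BitSet(BITS);
    private final Set<String> terms = new HashSet<>(); // stands in for the real term dictionary
    int dictionaryLookups; // counts how often the (expensive) dictionary was consulted

    void add(String key) {
        terms.add(key);
        filter.set(h1(key));
        filter.set(h2(key));
    }

    boolean contains(String key) {
        // If either bit is clear, the key was never added: skip the dictionary entirely.
        if (!filter.get(h1(key)) || !filter.get(h2(key))) {
            return false;
        }
        // Both bits set: the key may be present (or this is a false positive), so check.
        dictionaryLookups++;
        return terms.contains(key);
    }

    private static int h1(String key) {
        return Math.floorMod(key.hashCode(), BITS);
    }

    private static int h2(String key) {
        // Second hash derived by mixing the first; adequate for a sketch only.
        return Math.floorMod(Integer.rotateLeft(key.hashCode() * 0x9E3779B1, 15), BITS);
    }
}
```

A Bloom filter never reports a present key as absent, so the result is always correct; the only cost of a false positive is one unnecessary dictionary lookup, paid for with the extra memory the filter occupies.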
