Distributed search

Isaac Hebsh
Does adding replicas (on additional servers) help to improve search
performance?

It is known that each query goes to all the shards. It's clear that if we
have massive load, then multiple cores serving the same shard are very
useful.
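
As a rough illustration of the statement above, the sketch below models how a distributed query might be routed: one replica is chosen for every shard, so each query still touches all shards, and extra replicas only spread successive queries across more cores. The cluster layout, the URLs, and the round-robin policy are assumptions made for this example; Solr's real load balancing may choose replicas differently.

```python
import itertools

# Hypothetical cluster layout: shard name -> replica core URLs (made up for this sketch).
CLUSTER = {
    "shard1": ["http://host1:8983/solr/core1", "http://host3:8983/solr/core1"],
    "shard2": ["http://host2:8983/solr/core2", "http://host4:8983/solr/core2"],
}


def make_router(cluster):
    """Return a function that picks one replica per shard for each new query.

    Round-robin is an assumed policy, used here only to keep the sketch simple.
    """
    cyclers = {shard: itertools.cycle(replicas) for shard, replicas in cluster.items()}

    def route():
        # Every shard is always included; only the replica chosen within it varies.
        return {shard: next(cycler) for shard, cycler in cyclers.items()}

    return route


route = make_router(CLUSTER)
first_query = route()
second_query = route()
print(first_query)
print(second_query)
```

Both queries hit shard1 and shard2; the second one simply lands on different cores. That helps when queries overlap in time, but a lone query still waits for one core per shard.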

But what happens if I never have concurrent queries (only one query is in the
system at any time), and I want each single query to return faster? Will a
bigger replication factor help?

In particular, will a complicated query (one that searches a large number of
fields) be split across multiple cores *of the same shard*? (E.g. core1 searching for
term1 in field1, and core2 searching for term2 in field2.)

And what about a query on a single field, where the query contains a lot of terms?

Thanks in advance.

Re: Distributed search

Mingfeng Yang
In your case, since there are no concurrent queries, adding replicas won't
help much with response time. However, breaking your index into
a few shards does improve query performance. I recently split an index
of 30 million documents (30 GB) into 4 shards, and the speedup was pretty
impressive (roughly 2-5x faster for a complicated query).
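
A back-of-the-envelope model may help explain why sharding helps a lone query while replicas do not. It assumes, purely for illustration, that search time grows linearly with the number of documents in a shard, that all shards are searched in parallel, and that merging results costs a fixed overhead. The function name and the cost constants are hypothetical, not measurements; only the 30 million documents and the 4 shards come from the post above.

```python
def estimated_latency_ms(total_docs, num_shards, ms_per_million_docs, merge_overhead_ms):
    """Toy latency estimate for one query with no concurrent load.

    Assumptions (not measured): linear search cost per shard, shards searched
    in parallel, and a fixed merge cost whenever there is more than one shard.
    The replica count does not appear: a single query uses one core per shard.
    """
    docs_per_shard = total_docs / num_shards
    shard_ms = docs_per_shard / 1_000_000 * ms_per_million_docs
    overhead_ms = merge_overhead_ms if num_shards > 1 else 0.0
    return shard_ms + overhead_ms


# Made-up cost constants, chosen only to make the arithmetic easy to follow.
one_shard = estimated_latency_ms(30_000_000, 1, ms_per_million_docs=10.0, merge_overhead_ms=20.0)
four_shards = estimated_latency_ms(30_000_000, 4, ms_per_million_docs=10.0, merge_overhead_ms=20.0)
print(one_shard, four_shards, one_shard / four_shards)
```

In this model the merge overhead keeps the speedup below the shard count, and real queries rarely scale linearly, so any actual gain has to be measured.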

Ming


(Replying to Isaac Hebsh's message of Mon, Jan 28, 2013 at 10:54 AM.)

Re: Distributed search

Isaac Hebsh
Well, my index is already split into 16 shards...
The behaviour I described (one query split across several cores of the same shard) simply doesn't happen, right?
Would it make sense as an improvement request?
Technically, can multiple Lucene responses be intersected this way?
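
To make the question concrete, here is a minimal sketch of what such an intersection could look like. It is not how Solr or Lucene behave today; it only assumes that each core returns every document matching its own clause together with a partial score, and that the replicas share the same document ids. The function names and the sample scores are invented for the example.

```python
def intersect_partial_results(*partials):
    """Combine per-clause results from different cores of the same shard.

    Each argument maps doc id -> partial score for one clause (an assumed
    format). A document survives only if every clause matched it, and its
    combined score is taken to be the sum of the partial scores.
    """
    if not partials:
        return {}
    common = set(partials[0])
    for partial in partials[1:]:
        common &= set(partial)
    return {doc: sum(partial[doc] for partial in partials) for doc in common}


def top_k(combined, k):
    """Return the k best (doc id, score) pairs, highest score first."""
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)[:k]


# Invented sample data: core1 evaluated term1 in field1, core2 term2 in field2.
core1_hits = {"doc1": 1.0, "doc2": 0.5, "doc3": 0.25}
core2_hits = {"doc2": 1.5, "doc3": 0.25, "doc4": 2.0}

combined = intersect_partial_results(core1_hits, core2_hits)
print(top_k(combined, 2))
```

One likely cost of this approach is that each core has to return all of its matches rather than only its top few, because a document ranked low for one clause can still end up in the final intersection. Shipping and merging those full lists may outweigh the time saved.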


(Replying to Mingfeng Yang's message of Mon, Jan 28, 2013 at 9:27 PM.)