Leader node on specific host machines?

Koen De Groote
Hello,

I'm looking for a way to configure my collections such that the leaders
of specific collections never share the same host.

This is a way to prevent several large and/or heavy-usage collections from
ending up on the same machine.

Is this something I can set in solrconfig.xml? Or are there rules for this?

Kind regards,
Koen De Groote

Re: Leader node on specific host machines?

Erick Erickson
There’s the preferredLeader property, see: https://lucene.apache.org/solr/guide/6_6/collections-api.html
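
As a rough sketch of how the property is typically applied: the Collections API has an ADDREPLICAPROP action to mark a replica and a REBALANCELEADERS action to ask Solr to move leadership to the marked replicas. The snippet below only builds the request URLs. The base URL, collection name, shard name and replica name are hypothetical placeholders, not values from this thread.

```python
from urllib.parse import urlencode

# Assumption: Solr listens here; change this for your own cluster.
SOLR = "http://localhost:8983/solr"


def collections_api_url(action, params, base=SOLR):
    """Build a Collections API request URL for the given action."""
    query = {"action": action, **params}
    return f"{base}/admin/collections?{urlencode(query)}"


# Mark one replica as the preferred leader of its shard.
# "big_collection", "shard1" and "core_node2" are made-up names.
mark_url = collections_api_url("ADDREPLICAPROP", {
    "collection": "big_collection",
    "shard": "shard1",
    "replica": "core_node2",
    "property": "preferredLeader",
    "property.value": "true",
})

# Ask Solr to make the preferred replicas the leaders where it can.
rebalance_url = collections_api_url("REBALANCELEADERS", {
    "collection": "big_collection",
})

print(mark_url)
print(rebalance_url)
```

Sending the first URL and then the second (with curl or any HTTP client) is the usual order: setting the property alone does not trigger a leader change.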

That said, this was put in for situations where there were hundreds of shards, with replicas from many shards hosted on any given machine, so it was possible in that setup to have 100 or more leaders on a single node.

In the usual case, the leader role doesn’t do very much extra work, and the extra work is mostly distributing the incoming documents to the followers during indexing (mostly I/O). During query time, the leader has no extra duties at all. So if “heavy use” means heavy querying, it shouldn’t make any appreciable difference.

I would urge you to gather evidence that this is worth the effort before spending time on it. And the “preferredLeader” property is just that: a preference, all things being equal. It’s still possible for a different replica to become the leader, otherwise you’d defeat the whole point of trying for HA.
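
One way to collect that evidence is to look at where the leaders actually are. The sketch below assumes the JSON shape returned by the Collections API CLUSTERSTATUS action (cluster → collections → shards → replicas, with a "leader" flag and a "node_name" on each replica). The sample data and the function names are invented for illustration.

```python
from collections import defaultdict


def leaders_by_node(cluster_status):
    """Map each node name to the (collection, shard) pairs it leads."""
    result = defaultdict(list)
    collections = cluster_status["cluster"]["collections"]
    for coll_name, coll in collections.items():
        for shard_name, shard in coll["shards"].items():
            for replica in shard["replicas"].values():
                # Assumption: the flag arrives as the string "true".
                if str(replica.get("leader", "")).lower() == "true":
                    result[replica["node_name"]].append((coll_name, shard_name))
    return dict(result)


def crowded_nodes(cluster_status, watched):
    """Nodes that lead shards of more than one watched collection."""
    crowded = {}
    for node, leaders in leaders_by_node(cluster_status).items():
        colls = sorted({c for c, _ in leaders if c in watched})
        if len(colls) > 1:
            crowded[node] = colls
    return crowded


# Hand-written sample in the assumed CLUSTERSTATUS shape.
sample = {"cluster": {"collections": {
    "orders": {"shards": {"shard1": {"replicas": {
        "core_node1": {"node_name": "host1:8983_solr", "leader": "true"},
        "core_node2": {"node_name": "host2:8983_solr"},
    }}}},
    "logs": {"shards": {"shard1": {"replicas": {
        "core_node3": {"node_name": "host1:8983_solr", "leader": "true"},
        "core_node4": {"node_name": "host3:8983_solr"},
    }}}},
}}}

print(crowded_nodes(sample, {"orders", "logs"}))
```

Because preferredLeader is only a preference, a check like this would need to be rerun after restarts or failovers rather than trusted once.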

For TLOG and PULL setups, the leader will always be a TLOG replica, so you could place the TLOG replicas strategically to get what you want. In this case, the leader does have a lot more work to do than the followers, so it makes more sense.
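
A minimal sketch of that placement, assuming a Solr version that supports replica types (7.0 or later, so newer than the 6.6 guide linked above): create the collection with TLOG replicas restricted to chosen nodes, then add PULL replicas elsewhere. Only the URLs are built here. The host names and the collection name are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: Solr listens here; change this for your own cluster.
SOLR = "http://localhost:8983/solr"


def collections_api_url(action, params, base=SOLR):
    """Build a Collections API request URL for the given action."""
    query = {"action": action, **params}
    return f"{base}/admin/collections?{urlencode(query)}"


# Create the collection with TLOG replicas only, limited to two nodes.
# Leadership can then only land on host1 or host2 (made-up node names).
create_url = collections_api_url("CREATE", {
    "name": "big_collection",
    "numShards": 1,
    "nrtReplicas": 0,
    "tlogReplicas": 2,
    "createNodeSet": "host1:8983_solr,host2:8983_solr",
})

# Add a PULL replica on another node; PULL replicas never become leader.
add_pull_url = collections_api_url("ADDREPLICA", {
    "collection": "big_collection",
    "shard": "shard1",
    "node": "host3:8983_solr",
    "type": "pull",
})

print(create_url)
print(add_pull_url)
```

Repeating this with a different node set per heavy collection keeps their possible leaders on separate hosts, at the cost of fixing by hand where each TLOG replica lives.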

Best,
Erick

> On Oct 28, 2019, at 6:13 AM, Koen De Groote <[hidden email]> wrote:


Re: Leader node on specific host machines?

Koen De Groote
Hello Erick,

Sorry for the late reply. I worked with this setting a bit and it works as
expected.

Indeed, I was not aware of the leader/follower task distribution and what
you say shines a different light on things.

Regardless, I now know about this property and can use it effectively,
which I could not before.

Thanks!

Best regards,
Koen De Groote


On Mon, Oct 28, 2019 at 12:51 PM Erick Erickson <[hidden email]> wrote:
