REBALANCELEADERS is not reliable

Bernd Fehling
Hi list,

unfortunately REBALANCELEADERS is not reliable and the leader
election has unpredictable results with SolrCloud 6.6.5 and
Zookeeper 3.4.10.
Seen with 5 shards / 3 replicas.

- CLUSTERSTATUS reports all replicas (core_nodes) as state=active.
- Using ADDREPLICAPROP, I set the preferredLeader property on other replicas.
- I called REBALANCELEADERS.
- Some leaders changed, some did not.
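
For anyone who wants to reproduce the sequence, here is a minimal sketch of the three Collections API requests involved. It only builds the URLs and sends nothing. The base URL, the collection name "mycoll", the shard "shard1" and the replica "core_node3" are placeholders, not values from the cluster described here.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host and port for your cluster.
SOLR = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Collections API URL for the given action."""
    query = {"action": action, "wt": "json"}
    query.update(params)
    return f"{SOLR}/admin/collections?{urlencode(query)}"


# 1. Check the replica states.
status_url = collections_api_url("CLUSTERSTATUS", collection="mycoll")

# 2. Mark one replica as the preferred leader of its shard.
prop_url = collections_api_url(
    "ADDREPLICAPROP",
    collection="mycoll",
    shard="shard1",
    replica="core_node3",
    property="preferredLeader",
    **{"property.value": "true"},
)

# 3. Ask Solr to move leadership to the preferred replicas.
rebalance_url = collections_api_url("REBALANCELEADERS", collection="mycoll")

for url in (status_url, prop_url, rebalance_url):
    print(url)
```

Step 2 has to be repeated once per shard before step 3 is called.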

I then tried:
- Removing the preferredLeader property from the replicas that had successfully become leader.
- Calling REBALANCELEADERS again for the rest. No success.
- Shutting down nodes to force leadership onto a specific replica left running.
   No success.
- Calling REBALANCELEADERS: it responds that the replica is inactive!!!
- Calling CLUSTERSTATUS: it reports that the replica is active!!!

Also, the replica that refuses to become leader is not in the list
under collections->[collection_name]->leader_elect->shard1..x->election
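
A rough way to spot such replicas is to compare the children of the election node with what CLUSTERSTATUS reports. The sketch below assumes the child names look like '<sessionId>-<coreNodeName>-n_<sequence>' and that the list of children was already read from ZooKeeper by some other means. The sample names and states are made up for illustration.

```python
def election_core_nodes(election_children):
    """Extract core node names from election znode child names.

    Assumes names look like '<sessionId>-<coreNodeName>-n_<sequence>'.
    """
    nodes = set()
    for name in election_children:
        head, sep, _seq = name.rpartition("-n_")
        if not sep:
            continue  # not in the assumed format, skip it
        _session, sep, core_node = head.partition("-")
        if sep:
            nodes.add(core_node)
    return nodes


def active_but_not_queued(shard_status, election_children):
    """Replicas reported as active that are missing from the election list."""
    queued = election_core_nodes(election_children)
    return sorted(
        name
        for name, replica in shard_status["replicas"].items()
        if replica.get("state") == "active" and name not in queued
    )


# Hypothetical data: one shard as it appears in CLUSTERSTATUS output,
# plus the children of its .../leader_elect/shard1/election node.
shard1 = {
    "replicas": {
        "core_node1": {"state": "active", "leader": "true"},
        "core_node2": {"state": "active"},
        "core_node3": {"state": "active"},
    }
}
children = [
    "72057602945220609-core_node1-n_0000000004",
    "72057602945220610-core_node2-n_0000000005",
]

missing = active_but_not_queued(shard1, children)
print(missing)
```

A replica that shows up in this result is active according to CLUSTERSTATUS but has no entry in the election list, which matches the contradiction described above.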

Where is CLUSTERSTATUS getting its state info from?

Has anyone else had problems with REBALANCELEADERS?

I noticed that the Reference Guide writes "preferredLeader" (with capital "L")
but the Java code has "preferredleader".
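
Because of that spelling difference, any script that checks the property should compare names case-insensitively. Here is a small sketch that lists shards whose current leader is not the preferred replica. It assumes the property appears on the replica entry in CLUSTERSTATUS output under a key like "property.preferredleader". The sample data is invented.

```python
def preferred_leader_mismatches(shards):
    """Return {shard: (preferred_replica, actual_leader)} where the two differ."""
    mismatches = {}
    for shard_name, shard in shards.items():
        preferred = leader = None
        for name, replica in shard["replicas"].items():
            # Lower-case keys and values so "preferredLeader" and
            # "preferredleader" are treated the same.
            props = {k.lower(): str(v).lower() for k, v in replica.items()}
            if props.get("property.preferredleader") == "true":
                preferred = name
            if props.get("leader") == "true":
                leader = name
        if preferred is not None and preferred != leader:
            mismatches[shard_name] = (preferred, leader)
    return mismatches


# Hypothetical "shards" section of a CLUSTERSTATUS response.
shards = {
    "shard1": {
        "replicas": {
            "core_node1": {"state": "active", "leader": "true"},
            "core_node2": {"state": "active", "property.preferredleader": "true"},
        }
    },
    "shard2": {
        "replicas": {
            "core_node3": {
                "state": "active",
                "leader": "true",
                "property.preferredLeader": "true",
            },
            "core_node4": {"state": "active"},
        }
    },
}

result = preferred_leader_mismatches(shards)
print(result)
```

Running this before and after REBALANCELEADERS shows which shards actually switched.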

Regards, Bernd

RE: REBALANCELEADERS is not reliable

Вадим Иванов
Hi, Bernd
I have tried REBALANCELEADERS with Solr 6.3 and 7.5.
I had very similar results and the same impression that it's not reliable :(
--
Br, Vadim


Re: REBALANCELEADERS is not reliable

Bernd Fehling
Hi Vadim,

thanks for confirming.
So it seems to be a general problem with Solr 6.x and 7.x, and it might
still be there in the most recent versions.

But where to start debugging this problem? Is something not stored
correctly in ZooKeeper, or is the Overseer the problem?

I was also reading something about a "leader queue" where possible
leaders have to be requeued or something similar.

Maybe I should try to get into a situation where a "locked" core
is on the Overseer and then connect the debugger to it and step
through it.
Peeking and poking around, like old Commodore 64 days :-)

Regards, Bernd


Re: REBALANCELEADERS is not reliable

Aman Tandon
For me today, I deleted the leader replica of one shard in a two-shard
collection. Then the other replica of that shard was getting elected as leader.

After waiting for a long time, I tried setting the preferredLeader property
with ADDREPLICAPROP on one of the replicas and then tried FORCELEADER, but no
luck. Then I also tried rebalancing, but that didn't help. Finally I had to
recreate the whole collection.

I'm not sure what the issue was, but neither FORCELEADER nor rebalancing
worked when there was no leader, even though the preferredLeader property was set.
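
In case it helps others in the same state, here is a small sketch for finding leaderless shards in CLUSTERSTATUS output and building the matching FORCELEADER request. It only constructs the URL and sends nothing. The base URL, the collection name and the sample data are placeholders.

```python
from urllib.parse import urlencode


def shards_without_leader(shards):
    """Names of shards in which no replica is flagged leader=true."""
    return sorted(
        shard_name
        for shard_name, shard in shards.items()
        if not any(
            str(replica.get("leader", "")).lower() == "true"
            for replica in shard["replicas"].values()
        )
    )


def force_leader_url(base, collection, shard):
    """Build a FORCELEADER request URL for one shard."""
    query = urlencode(
        {"action": "FORCELEADER", "collection": collection, "shard": shard}
    )
    return f"{base}/admin/collections?{query}"


# Hypothetical "shards" section of a CLUSTERSTATUS response.
shards = {
    "shard1": {"replicas": {"core_node2": {"state": "active"}}},
    "shard2": {"replicas": {"core_node3": {"state": "active", "leader": "true"}}},
}

leaderless = shards_without_leader(shards)
urls = [
    force_leader_url("http://localhost:8983/solr", "mycoll", shard)
    for shard in leaderless
]
print(leaderless)
print(urls)
```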

Re: REBALANCELEADERS is not reliable

Aman Tandon
++ correction

On Fri, Nov 30, 2018, 01:10 Aman Tandon <[hidden email]> wrote:

> For me today, I deleted the leader replica of one shard in a two-shard
> collection. Then the other replicas of that shard weren't getting elected
> as leader.
>
> After waiting for a long time, I tried setting the preferredLeader property
> with ADDREPLICAPROP on one of the replicas and then tried FORCELEADER, but no
> luck. Then I also tried rebalancing, but that didn't help. Finally I had to
> recreate the whole collection.
>
> I'm not sure what the issue was, but neither FORCELEADER nor rebalancing
> worked when there was no leader, even though the preferredLeader property was set.

Re: REBALANCELEADERS is not reliable

Atita Arora
Indeed, I tried that on 7.4 and 7.5 too, and it did not work for me either,
even with the preferredLeader property as recommended in the documentation.
I handled it with a little hack, but it certainly didn't work as expected.
I can provide more details if there's a ticket.

RE: REBALANCELEADERS is not reliable

Вадим Иванов
In the solr-dev forum I came across this post:
http://lucene.472066.n3.nabble.com/Rebalance-Leaders-Leader-node-deleted-when-rebalancing-leaders-td4417040.html
Maybe it will shed some light?

--
Vadim

Re: REBALANCELEADERS is not reliable

Bernd Fehling
Thanks for looking this up.
It could be a hint about where to jump into the code.
I wonder why they rejected a Jira ticket about this problem?

Regards, Bernd

RE: REBALANCELEADERS is not reliable

Вадим Иванов
I'm waiting for 7.6 or 7.5.1 and plan to apply the patch from Endika Posadas to it.
Then I'll test again and hope it helps.
--
Vadim

