Rebalance Leaders: Leader node deleted when rebalancing leaders


Endika Posadas
Hello everyone,

I was going to open a Jira ticket with the following content, but Jira directed me to email the Solr mailing list instead.


I ran into the following problem in Solr, which I fixed with the attached patch.
The nodes are arranged as follows:

If node1 is the leader, you have
node1(1) <- node2(2) <- node3(3) <- node4(4)
where "a <- b" means that b watches a, and each node's sequence number is in parentheses.


Now node4 becomes the first watcher. It ends up with the same sequence number as node2, so both of them watch node1:

node1(1) <- node2(2)
         <- node4(2) <- node3(3)


When node1 goes down, node2 will try to set itself as the first watcher. To do that, it deletes its current node:

node4(2) <- node3(3)

and then sets itself as the first watcher:

node2(2)
node4(2) <- node3(3)

To verify that the node has been set at the front of the queue, Solr will check that the sequence number has changed.

And there lies the problem: since the sequence number is the same as before, Solr won't detect the node, and the node will be unable to become leader.
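As a rough illustration of the failure described above, here is a toy Python model. It is not Solr's actual code: the function name `rejoin_at_head` and the representation of the queue as a list of (sequence, node) tuples are assumptions made only for this sketch.

```python
# Toy model of the election queue; NOT Solr's implementation.
# Each entry is (sequence_number, node_name). All names are hypothetical.

def rejoin_at_head(queue, node):
    """Delete `node`'s entry, re-add it with the lowest sequence number left
    in the queue, and report whether comparing the old and new sequence
    numbers notices the move."""
    old_seq = next(seq for seq, name in queue if name == node)
    remaining = [(seq, name) for seq, name in queue if name != node]
    head_seq = min(seq for seq, _ in remaining)   # sequence of the current head
    new_queue = sorted(remaining + [(head_seq, node)])
    detected = head_seq != old_seq                # "has the sequence changed?"
    return new_queue, detected


# node1 has already gone down; node4 moves to the head and is detected.
queue, detected = rejoin_at_head(
    [(2, "node2"), (3, "node3"), (4, "node4")], "node4")
print(queue, detected)

# node2 now rejoins at the head, but its sequence number stays 2,
# so the comparison cannot tell that anything happened.
queue, detected = rejoin_at_head(queue, "node2")
print(queue, detected)
```

In this simplified model the second call reports `detected` as False, which corresponds to the situation where the node is never recognised at the front of the queue.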


Instead, Solr could check whether the node is already the first watcher and, if it is, just send the other nodes with the duplicate sequence number to the back of the queue. Since the node is already the first watcher, there is then no need to reposition it before it tries to become leader.
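The idea can be sketched in the same toy style. This is only an illustration of the proposal, not the contents of the attached patch: the function name `promote_first_watcher`, the tuple-based queue, and the way new back-of-queue sequence numbers are chosen are all assumptions.

```python
# Toy sketch of the proposed behaviour; NOT the attached patch.
# Each entry is (sequence_number, node_name). All names are hypothetical.

def promote_first_watcher(queue, node):
    """If `node` already holds the lowest sequence number, push every other
    node sharing that number to the back of the queue and leave `node` alone.
    Returns (new_queue, handled); handled is False when `node` is not at the
    head and the usual rejoin path would still be needed."""
    head_seq = min(seq for seq, _ in queue)
    if (head_seq, node) not in queue:
        return list(queue), False
    back_seq = max(seq for seq, _ in queue) + 1   # first free number at the back
    new_queue = []
    for seq, name in queue:
        if seq == head_seq and name != node:
            new_queue.append((back_seq, name))    # duplicate goes to the back
            back_seq += 1
        else:
            new_queue.append((seq, name))
    return sorted(new_queue), True


queue = [(2, "node2"), (2, "node4"), (3, "node3")]
print(promote_first_watcher(queue, "node2"))
```

With the queue from the example, node2 keeps sequence 2 and is the only node at the head, so it never has to delete and recreate its own entry.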






Attachment: Fixed_leader_node_deletion_when_rebalancing_leaders.patch (15K)