Node not recovering, leader elections not occuring

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Node not recovering, leader elections not occuring

Tom Evans
Hi all - problem with a SolrCloud 5.5.0, we have a node that has most
of the collections on it marked as "Recovering" or "Recovery Failed".
It attempts to recover from the leader, but the leader responds with:

Error while trying to recover.
core=iris_shard1_replica1:java.util.concurrent.ExecutionException:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
Error from server at http://172.31.1.171:30000/solr: We are not the
leader
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:596)
at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:353)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
Error from server at http://172.31.1.171:30000/solr: We are not the
leader
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:576)
at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:284)
at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:280)
... 5 more

and recovery never occurs.

Each collection in this state has plenty (10+) of active replicas, but
stopping the server that is marked as the leader doesn't trigger a
leader election amongst these replicas.

REBALANCELEADERS did nothing.
FORCELEADER complains that there is already a leader.
FORCELEADER with the purported leader stopped took 45 seconds,
reported status of "0" (and no other message) and kept the down node
as the leader (!)
Deleting the failed collection from the failed node and re-adding it
has the same "Leader said I'm not the leader" error message.

Any other ideas?

Cheers

Tom
Reply | Threaded
Open this post in threaded view
|

Re: Node not recovering, leader elections not occuring

Erick Erickson
How many replicas per Solr JVM? And do you
see any OOM errors when you bounce a server?
And how patient are you being, because it can
take 3 minutes for a leaderless shard to decide
it needs to elect a leader.

See SOLR-7280 and SOLR-7191 for the case
where lots of replicas are in the same JVM,
the tell-tale symptom is errors in the log as you
bring Solr up saying something like
"OutOfMemory error.... unable to create native thread"

SOLR-7280 has patches for 6x and 7x, with a 5x one
being added momentarily.

Best,
Erick

On Tue, Jul 19, 2016 at 7:41 AM, Tom Evans <[hidden email]> wrote:

> Hi all - problem with a SolrCloud 5.5.0, we have a node that has most
> of the collections on it marked as "Recovering" or "Recovery Failed".
> It attempts to recover from the leader, but the leader responds with:
>
> Error while trying to recover.
> core=iris_shard1_replica1:java.util.concurrent.ExecutionException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://172.31.1.171:30000/solr: We are not the
> leader
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:596)
> at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:353)
> at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://172.31.1.171:30000/solr: We are not the
> leader
> at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:576)
> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:284)
> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:280)
> ... 5 more
>
> and recovery never occurs.
>
> Each collection in this state has plenty (10+) of active replicas, but
> stopping the server that is marked as the leader doesn't trigger a
> leader election amongst these replicas.
>
> REBALANCELEADERS did nothing.
> FORCELEADER complains that there is already a leader.
> FORCELEADER with the purported leader stopped took 45 seconds,
> reported status of "0" (and no other message) and kept the down node
> as the leader (!)
> Deleting the failed collection from the failed node and re-adding it
> has the same "Leader said I'm not the leader" error message.
>
> Any other ideas?
>
> Cheers
>
> Tom
Reply | Threaded
Open this post in threaded view
|

Re: Node not recovering, leader elections not occuring

Tom Evans
There are 11 collections, each only has one shard, and each node has
10 replicas (9 collections are on every node, 2 are just on one node).
We're not seeing any OOM errors on restart.

I think we're being patient waiting for the leader election to occur.
We stopped the troublesome "leader that is not the leader" server
about 15-20 minutes ago, but we still have not had a leader election.

Cheers

Tom

On Tue, Jul 19, 2016 at 4:30 PM, Erick Erickson <[hidden email]> wrote:

> How many replicas per Solr JVM? And do you
> see any OOM errors when you bounce a server?
> And how patient are you being, because it can
> take 3 minutes for a leaderless shard to decide
> it needs to elect a leader.
>
> See SOLR-7280 and SOLR-7191 for the case
> where lots of replicas are in the same JVM,
> the tell-tale symptom is errors in the log as you
> bring Solr up saying something like
> "OutOfMemory error.... unable to create native thread"
>
> SOLR-7280 has patches for 6x and 7x, with a 5x one
> being added momentarily.
>
> Best,
> Erick
>
> On Tue, Jul 19, 2016 at 7:41 AM, Tom Evans <[hidden email]> wrote:
>> Hi all - problem with a SolrCloud 5.5.0, we have a node that has most
>> of the collections on it marked as "Recovering" or "Recovery Failed".
>> It attempts to recover from the leader, but the leader responds with:
>>
>> Error while trying to recover.
>> core=iris_shard1_replica1:java.util.concurrent.ExecutionException:
>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://172.31.1.171:30000/solr: We are not the
>> leader
>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>> at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:596)
>> at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:353)
>> at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://172.31.1.171:30000/solr: We are not the
>> leader
>> at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:576)
>> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:284)
>> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:280)
>> ... 5 more
>>
>> and recovery never occurs.
>>
>> Each collection in this state has plenty (10+) of active replicas, but
>> stopping the server that is marked as the leader doesn't trigger a
>> leader election amongst these replicas.
>>
>> REBALANCELEADERS did nothing.
>> FORCELEADER complains that there is already a leader.
>> FORCELEADER with the purported leader stopped took 45 seconds,
>> reported status of "0" (and no other message) and kept the down node
>> as the leader (!)
>> Deleting the failed collection from the failed node and re-adding it
>> has the same "Leader said I'm not the leader" error message.
>>
>> Any other ideas?
>>
>> Cheers
>>
>> Tom
Reply | Threaded
Open this post in threaded view
|

Re: Node not recovering, leader elections not occuring

Tom Evans
On the nodes that have the replica in a recovering state we now see:

19-07-2016 16:18:28 ERROR RecoveryStrategy:159 - Error while trying to
recover. core=lookups_shard1_replica8:org.apache.solr.common.SolrException:
No registered leader was found after waiting for 4000ms , collection:
lookups slice: shard1
        at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:607)
        at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:593)
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:308)
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

19-07-2016 16:18:28 INFO  RecoveryStrategy:444 - Replay not started,
or was not successful... still buffering updates.
19-07-2016 16:18:28 ERROR RecoveryStrategy:481 - Recovery failed -
trying again... (164)
19-07-2016 16:18:28 INFO  RecoveryStrategy:503 - Wait [12.0] seconds
before trying to recover again (attempt=165)


This is with the "leader that is not the leader" shut down.

Issuing a FORCELEADER via collections API doesn't in fact force a
leader election to occur.

Is there any other way to prompt Solr to have an election?

Cheers

Tom

On Tue, Jul 19, 2016 at 5:10 PM, Tom Evans <[hidden email]> wrote:

> There are 11 collections, each only has one shard, and each node has
> 10 replicas (9 collections are on every node, 2 are just on one node).
> We're not seeing any OOM errors on restart.
>
> I think we're being patient waiting for the leader election to occur.
> We stopped the troublesome "leader that is not the leader" server
> about 15-20 minutes ago, but we still have not had a leader election.
>
> Cheers
>
> Tom
>
> On Tue, Jul 19, 2016 at 4:30 PM, Erick Erickson <[hidden email]> wrote:
>> How many replicas per Solr JVM? And do you
>> see any OOM errors when you bounce a server?
>> And how patient are you being, because it can
>> take 3 minutes for a leaderless shard to decide
>> it needs to elect a leader.
>>
>> See SOLR-7280 and SOLR-7191 for the case
>> where lots of replicas are in the same JVM,
>> the tell-tale symptom is errors in the log as you
>> bring Solr up saying something like
>> "OutOfMemory error.... unable to create native thread"
>>
>> SOLR-7280 has patches for 6x and 7x, with a 5x one
>> being added momentarily.
>>
>> Best,
>> Erick
>>
>> On Tue, Jul 19, 2016 at 7:41 AM, Tom Evans <[hidden email]> wrote:
>>> Hi all - problem with a SolrCloud 5.5.0, we have a node that has most
>>> of the collections on it marked as "Recovering" or "Recovery Failed".
>>> It attempts to recover from the leader, but the leader responds with:
>>>
>>> Error while trying to recover.
>>> core=iris_shard1_replica1:java.util.concurrent.ExecutionException:
>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>> Error from server at http://172.31.1.171:30000/solr: We are not the
>>> leader
>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>>> at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:596)
>>> at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:353)
>>> at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
>>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>> Error from server at http://172.31.1.171:30000/solr: We are not the
>>> leader
>>> at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:576)
>>> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:284)
>>> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:280)
>>> ... 5 more
>>>
>>> and recovery never occurs.
>>>
>>> Each collection in this state has plenty (10+) of active replicas, but
>>> stopping the server that is marked as the leader doesn't trigger a
>>> leader election amongst these replicas.
>>>
>>> REBALANCELEADERS did nothing.
>>> FORCELEADER complains that there is already a leader.
>>> FORCELEADER with the purported leader stopped took 45 seconds,
>>> reported status of "0" (and no other message) and kept the down node
>>> as the leader (!)
>>> Deleting the failed collection from the failed node and re-adding it
>>> has the same "Leader said I'm not the leader" error message.
>>>
>>> Any other ideas?
>>>
>>> Cheers
>>>
>>> Tom
Reply | Threaded
Open this post in threaded view
|

Re: Node not recovering, leader elections not occuring

Jeff Wartes
It sounds like the node-local version of the ZK clusterstate has diverged from the ZK cluster state. You should check the contents of zookeeper and verify the state there looks sane. I’ve had issues (v5.4) on a few occasions where leader election got screwed up to the point where I had to delete data from ZK manually. (Usually after a ZK issue.)

Particularly, take a look in
collections/<collection>/leader_elect/shard<x>/election

Non-sane states would be the same core_nodeX listed more than once, or fewer entries than (up) replicas. If you’re having trouble getting a sane election, you can try deleting the lowest-numbered entries (as well as any lower-numbered duplicates) and trying forceelection again. Possibly followed by restarting the node with that lowest-numbered entry.

Also make sure that this exists and has the expected replica
collections/<collection>/leader/shard<x>

collections/<collection>/leader_initiated_recovery
can be informative too, this represents replicas that the *leader* thinks are out of sync, usually due to a failed update request.


On 7/19/16, 9:20 AM, "Tom Evans" <[hidden email]> wrote:

    On the nodes that have the replica in a recovering state we now see:
   
    19-07-2016 16:18:28 ERROR RecoveryStrategy:159 - Error while trying to
    recover. core=lookups_shard1_replica8:org.apache.solr.common.SolrException:
    No registered leader was found after waiting for 4000ms , collection:
    lookups slice: shard1
            at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:607)
            at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:593)
            at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:308)
            at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
   
    19-07-2016 16:18:28 INFO  RecoveryStrategy:444 - Replay not started,
    or was not successful... still buffering updates.
    19-07-2016 16:18:28 ERROR RecoveryStrategy:481 - Recovery failed -
    trying again... (164)
    19-07-2016 16:18:28 INFO  RecoveryStrategy:503 - Wait [12.0] seconds
    before trying to recover again (attempt=165)
   
   
    This is with the "leader that is not the leader" shut down.
   
    Issuing a FORCELEADER via collections API doesn't in fact force a
    leader election to occur.
   
    Is there any other way to prompt Solr to have an election?
   
    Cheers
   
    Tom
   
    On Tue, Jul 19, 2016 at 5:10 PM, Tom Evans <[hidden email]> wrote:
    > There are 11 collections, each only has one shard, and each node has
    > 10 replicas (9 collections are on every node, 2 are just on one node).
    > We're not seeing any OOM errors on restart.
    >
    > I think we're being patient waiting for the leader election to occur.
    > We stopped the troublesome "leader that is not the leader" server
    > about 15-20 minutes ago, but we still have not had a leader election.
    >
    > Cheers
    >
    > Tom
    >
    > On Tue, Jul 19, 2016 at 4:30 PM, Erick Erickson <[hidden email]> wrote:
    >> How many replicas per Solr JVM? And do you
    >> see any OOM errors when you bounce a server?
    >> And how patient are you being, because it can
    >> take 3 minutes for a leaderless shard to decide
    >> it needs to elect a leader.
    >>
    >> See SOLR-7280 and SOLR-7191 for the case
    >> where lots of replicas are in the same JVM,
    >> the tell-tale symptom is errors in the log as you
    >> bring Solr up saying something like
    >> "OutOfMemory error.... unable to create native thread"
    >>
    >> SOLR-7280 has patches for 6x and 7x, with a 5x one
    >> being added momentarily.
    >>
    >> Best,
    >> Erick
    >>
    >> On Tue, Jul 19, 2016 at 7:41 AM, Tom Evans <[hidden email]> wrote:
    >>> Hi all - problem with a SolrCloud 5.5.0, we have a node that has most
    >>> of the collections on it marked as "Recovering" or "Recovery Failed".
    >>> It attempts to recover from the leader, but the leader responds with:
    >>>
    >>> Error while trying to recover.
    >>> core=iris_shard1_replica1:java.util.concurrent.ExecutionException:
    >>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
    >>> Error from server at http://172.31.1.171:30000/solr: We are not the
    >>> leader
    >>> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    >>> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    >>> at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:596)
    >>> at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:353)
    >>> at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:224)
    >>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    >>> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    >>> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
    >>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    >>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    >>> at java.lang.Thread.run(Thread.java:745)
    >>> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
    >>> Error from server at http://172.31.1.171:30000/solr: We are not the
    >>> leader
    >>> at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:576)
    >>> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:284)
    >>> at org.apache.solr.client.solrj.impl.HttpSolrClient$1.call(HttpSolrClient.java:280)
    >>> ... 5 more
    >>>
    >>> and recovery never occurs.
    >>>
    >>> Each collection in this state has plenty (10+) of active replicas, but
    >>> stopping the server that is marked as the leader doesn't trigger a
    >>> leader election amongst these replicas.
    >>>
    >>> REBALANCELEADERS did nothing.
    >>> FORCELEADER complains that there is already a leader.
    >>> FORCELEADER with the purported leader stopped took 45 seconds,
    >>> reported status of "0" (and no other message) and kept the down node
    >>> as the leader (!)
    >>> Deleting the failed collection from the failed node and re-adding it
    >>> has the same "Leader said I'm not the leader" error message.
    >>>
    >>> Any other ideas?
    >>>
    >>> Cheers
    >>>
    >>> Tom