Solr behaves wonky when ZooKeeper quorum is messed up.

Solr behaves wonky when ZooKeeper quorum is messed up.

harjags
In our production Solr cluster (Solr 7.6, ZooKeeper 3.4.9), when the ZooKeeper
leader fails, ZooKeeper enters an infinite leader election loop, which makes
Solr unstable. Solr fails to index (as expected, with the error "Remote error
message: Cannot talk to ZooKeeper - Updates are disabled") and CPU usage spikes.

This is a known ZK bug: https://issues.apache.org/jira/browse/ZOOKEEPER-2164
My questions are below:

1. Solr behaving wonky when the ZK quorum is affected is expected, but is
there a workaround?
2. The ZK bug is fixed in 3.5.6. Is Solr 7.6 compatible with ZK 3.5.6?
3. Does anyone have an opinion on "fast leader election" vs. the "original
UDP-based" election in ZK? Will changing the election algorithm version from
3 to 0 solve the issue?

--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr behaves wonky when ZooKeeper quorum is messed up.

Shawn Heisey-2
On 9/19/2019 9:22 AM, harjagsbby wrote:

> In our production Solr cluster (Solr 7.6, ZooKeeper 3.4.9), when the ZooKeeper
> leader fails, ZooKeeper enters an infinite leader election loop, which makes
> Solr unstable. Solr fails to index (as expected, with the error "Remote error
> message: Cannot talk to ZooKeeper - Updates are disabled") and CPU usage spikes.
>
> This is a known ZK bug: https://issues.apache.org/jira/browse/ZOOKEEPER-2164
> My questions are below:
>
> 1. Solr behaving wonky when the ZK quorum is affected is expected, but is
> there a workaround?

If ZK loses quorum, Solr goes read-only.  That's how it's designed to
work.  I don't know of any workaround for that.
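
To see whether the ensemble actually has quorum at a given moment, one option is ZooKeeper's four-letter-word commands. The following is only a rough sketch under some assumptions: it sends `srvr` to each node over the client port and reads the `Mode:` line of the reply. The function names are made up for this example, and newer ZooKeeper releases may require the command to be allowed via `4lw.commands.whitelist`.

```python
import socket


def four_letter_word(host, port, cmd="srvr", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply):
    """Return the value of the 'Mode:' line, or None if the node is not serving."""
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def quorum_summary(modes):
    """Summarize a {server: mode-or-None} mapping.

    Treats the ensemble as healthy when exactly one node reports 'leader'
    and a majority of the listed nodes report 'leader' or 'follower'.
    This assumes every voting member is listed in `modes`.
    """
    serving = [s for s, m in modes.items() if m in ("leader", "follower")]
    leaders = [s for s, m in modes.items() if m == "leader"]
    majority = len(modes) // 2 + 1
    return {
        "serving": len(serving),
        "leaders": leaders,
        "has_quorum": len(leaders) == 1 and len(serving) >= majority,
    }


def check_ensemble(servers):
    """Query each (host, port) pair; unreachable nodes are recorded as None."""
    modes = {}
    for host, port in servers:
        key = f"{host}:{port}"
        try:
            modes[key] = parse_mode(four_letter_word(host, port))
        except OSError:  # connection refused, timeout, DNS failure, ...
            modes[key] = None
    return modes, quorum_summary(modes)
```

Calling `check_ensemble([("zk1.example.com", 2181), ("zk2.example.com", 2181), ("zk3.example.com", 2181)])` (hypothetical host names) returns the per-node modes and the summary. A node that is stuck in leader election typically answers without a `Mode:` line, so it shows up as `None` here.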

> 2. The ZK bug is fixed in 3.5.6. Is Solr 7.6 compatible with ZK 3.5.6?

Generally speaking, yes, Solr will work with ZK 3.5.x.  But the ZK
status parts of the admin UI will not work right.  That problem is fixed
in Solr 8.3, which hasn't been released yet.

https://issues.apache.org/jira/browse/SOLR-13672

> 3. Does anyone have an opinion on "fast leader election" vs. the "original
> UDP-based" election in ZK? Will changing the election algorithm version from
> 3 to 0 solve the issue?

You would have to ask the ZooKeeper mailing list that question.

Thanks,
Shawn

Re: Solr behaves wonky when ZooKeeper quorum is messed up.

harjags
> If ZK loses quorum, Solr goes read-only.  That's how it's designed to
> work.  I don't know of any workaround for that.

That makes sense. Is the CPU spike in Solr caused by Solr's indexing calls
holding on to threads because, from Solr's point of view, ZooKeeper is down?


