SOLR zookeeper connection timeout during startup is hardcoded to 10000ms

dshih
Hi,

During startup in cloud mode, the SOLR zookeeper connection timeout appears to be hardcoded to 1000ms:
https://github.com/apache/lucene-solr/blob/5eab1c3c688a0d8db650c657567f197fb3dcf181/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L45

And it is not configurable via zkClientTimeout (solr.xml) or SOLR_WAIT_FOR_ZK (solr.in.sh).

Is there a way to configure this, and if not, should I open a bug?

Thanks,
Danny

Re: SOLR zookeeper connection timeout during startup is hardcoded to 10000ms

Erick Erickson
That's actually 10,000 ms, a typo in your message?

Do you have a situation where that setting is causing you trouble?
Because 10 seconds for communications with ZK is quite a long time,
I'm curious what the circumstances are that you're seeing.

Best,
Erick

On Wed, Aug 22, 2018 at 3:51 PM, Danny Shih <[hidden email]> wrote:

> Hi,
>
> During startup in cloud mode, the SOLR zookeeper connection timeout appears to be hardcoded to 1000ms:
> https://github.com/apache/lucene-solr/blob/5eab1c3c688a0d8db650c657567f197fb3dcf181/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L45
>
> And it is not configurable via zkClientTimeout (solr.xml) or SOLR_WAIT_FOR_ZK (solr.in.sh).
>
> Is there a way to configure this, and if not, should I open a bug?
>
> Thanks,
> Danny

Re: SOLR zookeeper connection timeout during startup is hardcoded to 10000ms

dshih
Sorry, yes 10,000 ms.

We have a single test cluster (out of probably hundreds) where one node hits
this consistently.  I'm not sure what kind of issues (network?) that node is
having.

Generally though, we ship SOLR as part of our product, and we cannot control
our customers' hardware and setup besides listing minimum requirements.
While I think this issue will probably be extremely rare, we would
definitely prefer to be able to say: "well, if you can't fix your hardware
issue, try increasing this timeout setting".

Thanks,
Danny



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: SOLR zookeeper connection timeout during startup is hardcoded to 10000ms

Dominique Bejean
Hi,

We are also experiencing timeout issues from time to time.

I sent the message below a month ago, by mistake, to the dev list.

Why does ZkClientClusterStateProvider.java use hardcoded values
when parameters for these timeouts already exist?

Regards

Dominique

================================================

We are experiencing an issue related to the ZK timeout.

The stack trace is:

ERROR 19 juin 2018 06:24:07,152 - h.concurrent.ConcurrentService:67   - Erreur dans l'attente de la fin de l'exécution d'un thread
ERROR 19 juin 2018 06:24:07,152 - h.concurrent.ConcurrentService:68   - org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxx.xxx:2181 within 10000 ms
ERROR 19 juin 2018 06:24:07,152 -           api.batch.Lanceur:98   - org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxx.xxx:2181 within 10000 ms
java.util.concurrent.ExecutionException: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxx.xxx:2181 within 10000 ms
 at java.util.concurrent.FutureTask.report(FutureTask.java:122)
 ...
Caused by: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxx.xxx:2181 within 10000 ms
 at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:182)
 at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:116)
 at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:106)
 at org.apache.solr.common.cloud.ZkStateReader.<init>(ZkStateReader.java:226)
 at org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.connect(ZkClientClusterStateProvider.java:121)
...


In solr.xml, we have:
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>

In solr.in.sh, we have:
#ZK_CLIENT_TIMEOUT="15000"
or
ZK_CLIENT_TIMEOUT="30000"

So zkClientTimeout  should be 30000.
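
For readers unfamiliar with the ${name:default} form used in solr.xml above, here is a minimal sketch of how such a placeholder can resolve: take the JVM system property if it is set, otherwise the default after the colon. The class and method names are hypothetical, and this is a simplified illustration, not Solr's actual substitution code (for example, it replaces a placeholder with no default by an empty string).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of ${name:default} substitution; not Solr's implementation.
public final class PlaceholderDemo {
    // Matches ${name} or ${name:default}.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    private PlaceholderDemo() {}

    // Replace each placeholder with the system property value, or its default if unset.
    public static String resolve(String text) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fallback = m.group(2) != null ? m.group(2) : "";
            String value = System.getProperty(m.group(1), fallback);
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "<int name=\"zkClientTimeout\">${zkClientTimeout:30000}</int>";
        // Uses the default unless the JVM was started with -DzkClientTimeout=...
        System.out.println(resolve(line));
    }
}
```

Under this reading, the value is 30000 whenever no zkClientTimeout system property is passed to the JVM.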

In the source code of ZkClientClusterStateProvider.java, I see that zkClientTimeout
is hardcoded to 10000. Is it expected that the configuration is not used?

lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java

int zkConnectTimeout = 10000;
int zkClientTimeout = 10000;

...

zk = new ZkStateReader(zkHost, zkClientTimeout, zkConnectTimeout);
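
As a rough sketch of what a configurable version could look like, the two values could be read from JVM system properties, with 10000 kept as the fallback. The class name, the helper, and the property names used here are assumptions chosen for illustration; this is not the code in lucene-solr and not a confirmed fix.

```java
// Hypothetical sketch: timeouts read from system properties instead of constants.
public final class ZkTimeouts {
    // Fallback matches the hardcoded value quoted above.
    public static final int DEFAULT_TIMEOUT_MS = 10000;

    private ZkTimeouts() {}

    // Returns the property as a timeout in ms, or the default when the
    // property is unset, not a number, or not positive.
    public static int timeoutMs(String propertyName) {
        Integer value = Integer.getInteger(propertyName, DEFAULT_TIMEOUT_MS);
        return value > 0 ? value : DEFAULT_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        // Property names are assumptions, e.g. -DzkConnectTimeout=30000
        int zkConnectTimeout = timeoutMs("zkConnectTimeout");
        int zkClientTimeout = timeoutMs("zkClientTimeout");
        System.out.println("zkConnectTimeout=" + zkConnectTimeout
                + " zkClientTimeout=" + zkClientTimeout);
    }
}
```

With something like this, the values passed to the ZkStateReader constructor would follow the configuration when it is present and stay at 10000 otherwise.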


Regards.

On Fri, Aug 24, 2018 at 20:15, dshih <[hidden email]> wrote:

> Sorry, yes 10,000 ms.
>
> We have a single test cluster (out of probably hundreds) where one node
> hits
> this consistently.  I'm not sure what kind of issues (network?) that node
> is
> having.
>
> Generally though, we ship SOLR as part of our product, and we cannot
> control
> our customers' hardware and setup besides listing minimum requirements.
> While I think this issue will probably be extremely rare, we would
> definitely prefer to be able to say: "well, if you can't fix your hardware
> issue, try increasing this timeout setting".
>
> Thanks,
> Danny
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>

Re: SOLR zookeeper connection timeout during startup is hardcoded to 10000ms

Ashish Bisht
Can this timeout value be changed?

Regards
Ashish


