Can replicas retry for a longer time?

Can replicas retry for a longer time?

Walter Underwood
We had a bad situation with our prod cluster. There was a DNS failure in AWS and all the replicas went “brown”. Only the leaders were taking traffic.

If the replicas had continued to attempt recovery every five minutes or so, they would have come back online automatically.

Is there a way to configure that?

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

Re: Can replicas retry for a longer time?

Walter Underwood
Hello! Any response on this?

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Apr 11, 2019, at 7:46 PM, Walter Underwood <[hidden email]> wrote:
>
> We had a bad situation with our prod cluster. There was a DNS failure in AWS and all the replicas went “brown”. Only the leaders were taking traffic.
>
> If the replicas had continued to attempt recovery every five minutes or so, they would have come back online automatically.
>
> Is there a way to configure that?

Re: Can replicas retry for a longer time?

Shalin Shekhar Mangar
Assuming this is SolrCloud, the replicas retry up to 500 times and wait an
exponentially increasing time between retries, starting from 2 seconds and
going up to a minute. Did that not happen in your case? Did the JVM have DNS
TTLs set?
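
To make the numbers above concrete, here is a minimal sketch of such a retry
schedule and of setting the JVM DNS cache TTLs. It is not Solr's recovery
code. The class and method names are hypothetical, the doubling factor is an
assumption (the description above only says "exponentially increasing"), and
the TTL values are purely illustrative. The two `networkaddress.cache.*`
names are standard Java security properties.

```java
import java.security.Security;

// Hypothetical sketch, not Solr's actual recovery implementation.
class RetryBackoffSketch {
    static final int MAX_RETRIES = 500;        // from the description above
    static final long START_DELAY_SECONDS = 2; // from the description above
    static final long MAX_DELAY_SECONDS = 60;  // "up to a minute"

    // Delay before retry number `attempt` (0-based).
    // Assumption: the delay doubles each time until it reaches the cap.
    static long delaySeconds(int attempt) {
        long delay = START_DELAY_SECONDS;
        for (int i = 0; i < attempt && delay < MAX_DELAY_SECONDS; i++) {
            delay = Math.min(MAX_DELAY_SECONDS, delay * 2);
        }
        return delay;
    }

    // Total time spent waiting across the first `retries` attempts.
    static long totalSeconds(int retries) {
        long total = 0;
        for (int attempt = 0; attempt < retries; attempt++) {
            total += delaySeconds(attempt);
        }
        return total;
    }

    // Sets the JVM-wide DNS cache TTLs. This should run before the first
    // hostname lookup; the values passed in are illustrative only.
    static void applyDnsTtl(int positiveTtlSeconds, int negativeTtlSeconds) {
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        applyDnsTtl(30, 5);
        for (int attempt = 0; attempt < 7; attempt++) {
            System.out.println("retry " + attempt + ": wait "
                    + delaySeconds(attempt) + "s");
        }
        System.out.println("total wait over " + MAX_RETRIES + " retries: "
                + totalSeconds(MAX_RETRIES) + "s");
    }
}
```

Under these assumptions the waits are 2, 4, 8, 16, 32 and then 60 seconds
each, so 500 retries cover roughly eight hours of waiting. If DNS results,
including failed lookups, are cached for a long time in the JVM, the retries
may keep failing even after DNS itself has recovered, which is why the TTL
question matters.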

On Sun, Apr 28, 2019 at 7:51 PM Walter Underwood <[hidden email]>
wrote:

> Hello! Any response on this?

--
Regards,
Shalin Shekhar Mangar.