In production solr cloud 4.6 is not working

Rajdeep Sahoo
Hi all,
  In production we set up SolrCloud on Solr 4.6. The setup has 3 ZooKeeper
nodes and 4 shards with three replicas each, for a total of 12 Solr nodes.
Active indexing was going on. After we switched over, we ran into a lot of
issues: all the nodes stopped serving search requests, the logs showed
recovery failing, and we saw full GCs on 2 shards.
  After a graceful restart the recovery failures were resolved, but we are
currently running master/slave with 16 slave nodes and 1 master, and it is
working fine.
  Do we need to scale up the SolrCloud setup? Please advise, as this is a
production environment; I am sure you all understand the impact.

Thanks in advance

Re: In production solr cloud 4.6 is not working

Rajdeep Sahoo
Hi,

Could anyone please advise?

On Sat, 11 Jan, 2020, 12:33 AM Rajdeep Sahoo, <[hidden email]>
wrote:

> Hi all,
>   In production we have done the set up of solr cloud with solr version
> 4.6 , the set up contains  3 zookeeper and 4 shards each having three
> replicas, a total of 12 solr nodes.
> Active indexing was going on , after switching on we are experiencing a
> lot of issues , all the nodes stopped serving the search requests and in
> log it is showing recovery failing , and can see Full GC for 2 shards.
>   After graceful restart , the recovery failing issue got resolved but
> currently we are using master slave with 16 slave nodes and 1 leader , it
> is working fine.
>   Do we need to scale it up in solr cloud? Please suggest as it is a
> production env. , I guess all you can understand the impact of it.
>
> Thanks in advance
>

Re: In production solr cloud 4.6 is not working

Erick Erickson
You’ve provided no details, nor relayed any findings from looking
at the Solr logs. In short, there’s not enough information here to
provide any helpful response.

Full GCs are normal, but if they’re long enough to exceed certain
timeouts, they can trigger recoveries. Solr 4.6 had a number of
conditions that can lead to this, but Solr 4.6 is over 6 years old.
There’s going to be little help available at this point.
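
As a rough illustration of that mechanism (not taken from the thread): a node
stopped in a full GC cannot keep its ZooKeeper session alive, so a pause that
outlasts the session timeout can get the node marked down and sent into
recovery. The sketch below assumes the timeout in question is the ZooKeeper
client timeout (`zkClientTimeout` in Solr); the 15-second value, the function
name, and the pause durations are all hypothetical.

```python
# Hypothetical value: the ZooKeeper session timeout configured for the cluster.
ZK_SESSION_TIMEOUT_S = 15.0


def pauses_exceeding_timeout(pauses_s, timeout_s=ZK_SESSION_TIMEOUT_S):
    """Return the pauses (in seconds) that last at least as long as the timeout."""
    return [pause for pause in pauses_s if pause >= timeout_s]


if __name__ == "__main__":
    observed = [0.4, 2.1, 15.3, 17.8]  # made-up pause durations, in seconds
    print(pauses_exceeding_timeout(observed))
```

Any pause this returns is a candidate for having triggered a recovery; shorter
pauses hurt latency but should not by themselves cost the node its session.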

You might want to review: https://wiki.apache.org/solr/UsingMailingLists


> On Jan 11, 2020, at 2:47 AM, Rajdeep Sahoo <[hidden email]> wrote:
>
> Hi ,
>
> Anyone Please suggest.

Re: In production solr cloud 4.6 is not working

Rajdeep Sahoo
The nodes were busy with full GCs lasting over 15 seconds.
The Solr console showed "no server hosting shard" with status code 500.
The Solr log had the same error, i.e. SolrException: No server hosting
shard.

Apart from this, the ZooKeeper console showed a "recovery failing" status
for the nodes.

Thanks in advance

On Sat, 11 Jan, 2020, 7:15 PM Erick Erickson, <[hidden email]>
wrote:

> You’ve provided no details, nor relayed any findings from looking
> at the Solr logs. In short, there’s not enough information here to
> provide any helpful response.
>
> Full GCs are normal, but if they’re long enough
> to exceed certain timeouts, they can trigger recoveries. Solr 4.6 had
> a number of conditions that can lead to this, but Solr 4.6 is
> over 6 years old. There’s going to be little help available at this point.
>
> You might want to review: https://wiki.apache.org/solr/UsingMailingLists
>

Re: In production solr cloud 4.6 is not working

Erick Erickson
15-second full GC pauses are far too long. I’d recommend you concentrate on sizing
your hardware and heap settings to minimize the full GC pauses.

Here’s a good place to start:
https://cwiki.apache.org/confluence/display/SOLR/ShawnHeisey
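
Before tuning anything, it helps to know how often and how long the full GCs
actually are. A minimal sketch for pulling that out of a GC log follows. It
assumes the classic HotSpot log shape produced with `-XX:+PrintGCDetails`,
where a full collection line contains "Full GC" and ends with
"real=<seconds> secs]"; other collectors and JVM versions log differently, so
the pattern would need adjusting. The function name, the 1-second threshold,
and the sample lines are made up for illustration.

```python
import re

# Assumed log shape: a line containing "Full GC" followed later by
# "real=<seconds> secs". Adjust the pattern for other GC log formats.
FULL_GC_REAL = re.compile(r"Full GC.*real=(\d+(?:\.\d+)?) secs")


def long_full_gcs(lines, threshold_s=1.0):
    """Yield (line_number, pause_seconds) for full GCs at or above threshold_s."""
    for number, line in enumerate(lines, start=1):
        match = FULL_GC_REAL.search(line)
        if match and float(match.group(1)) >= threshold_s:
            yield number, float(match.group(1))


if __name__ == "__main__":
    # Made-up sample lines in the assumed format.
    sample = [
        "1.234: [GC [PSYoungGen: 1K->0K(2K)] 5K->3K(10K), 0.0123450 secs] "
        "[Times: user=0.01 sys=0.00, real=0.01 secs]",
        "9.876: [Full GC [PSYoungGen: 1K->0K(2K)] 9K->8K(10K), 15.1234560 secs] "
        "[Times: user=20.10 sys=0.10, real=15.12 secs]",
    ]
    for number, pause in long_full_gcs(sample):
        print(f"line {number}: full GC paused {pause:.2f}s")
```

To scan a real log, pass an open file object as `lines`; the path depends on
how GC logging was enabled on the node.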

Best,
Erick

> On Jan 11, 2020, at 9:14 AM, Rajdeep Sahoo <[hidden email]> wrote:
>
> Nodes were busy with full Gc over 15 sec.
> In solr console it was showing no server hosting shard,status code 500.
> In solr log the same error was there i.e. solrexception: No server hosting
> shard.
>
> Apart from this,in the zoo console it was showing recovery failing status
> of the nodes.
>
> Thanks in advance