Solr cloud production set up

Rajdeep Sahoo
Hi all,
We are using SolrCloud 7.7.1.
How many SolrCloud servers do we need in a live production environment?
Currently we run a master/slave setup with 16 slave servers on Solr 4.6.
With SolrCloud, do we need to scale up, or will 16 servers suffice?

Re: Solr cloud production set up

Rajdeep Sahoo
Could anyone please reply?

Re: Solr cloud production set up

Walter Underwood
Why do you want to change to Solr Cloud? Master/slave is a great, stable cluster architecture.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

Re: Solr cloud production set up

Rajdeep Sahoo
Our index is huge, and in master/slave a full indexing run takes almost
24 hours.
The number of documents will increase in the future.
So could someone please recommend the number of nodes and the configuration
(RAM and CPU cores) for SolrCloud?

Re: Solr cloud production set up

Walter Underwood
How big? We index 35 million documents in about 6 hours.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

Re: Solr cloud production set up

Jörn Franke
In reply to this post by Rajdeep Sahoo
I think you should do your own measurements. This is very document- and processing-specific.
You can run a test with a simple setup for, let’s say, 1 million documents and extrapolate from that. It could also be that your ETL is the bottleneck and not Solr.
At the same time you can simulate user queries using JMeter or similar.
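
As a rough illustration of the extrapolation step, here is a minimal Python sketch. It assumes indexing time grows linearly with document count, which may not hold for a real index; the function name and all numbers are hypothetical and not measurements from this thread.

```python
def extrapolate_indexing_hours(sample_docs, sample_seconds, total_docs):
    """Linearly scale a measured sample run up to the full document count."""
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    seconds_per_doc = sample_seconds / sample_docs
    return seconds_per_doc * total_docs / 3600.0


# Hypothetical test run: 1,000,000 documents indexed in 1,800 seconds,
# scaled up to a hypothetical corpus of 35,000,000 documents.
estimate = extrapolate_indexing_hours(1_000_000, 1_800, 35_000_000)
print(f"Estimated full indexing time: {estimate:.1f} hours")
```

If the estimate is far below what production actually takes, that would point at the ETL side rather than at Solr.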

Re: Solr cloud production set up

Rajdeep Sahoo
Got your point.
In terms of infrastructure, does SolrCloud need more of it than
master/slave?

Re: Solr cloud production set up

Shawn Heisey-2
In reply to this post by Rajdeep Sahoo
On 1/18/2020 1:05 AM, Rajdeep Sahoo wrote:
> Our Index size is huge and in master slave the full indexing time is almost
> 24 hrs.
>     In future the no of documents will increase.
> So,please some one recommend about the no of nodes and configuration like
> ram and cpu core for solr cloud.

Indexing is not going to be any faster in SolrCloud.  It would probably
be a little bit slower.  The best way to speed up indexing, whether
running SolrCloud or not, is to make your indexing processes run in
parallel, so that multiple batches of documents are being indexed at the
same time.
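
A minimal sketch of what running indexing in parallel could look like from a Python client. The update URL, core name, batch size, and worker count are assumptions for illustration only; the sketch posts JSON document batches to Solr's update handler from several threads and leaves commits to autoCommit or a final explicit commit.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def batches(docs, size):
    """Yield consecutive slices of docs with at most `size` items each."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def post_batch(batch, update_url="http://localhost:8983/solr/mycore/update"):
    """POST one batch as a JSON array (URL and core name are hypothetical)."""
    request = urllib.request.Request(
        update_url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_in_parallel(docs, batch_size=1000, workers=4, send=post_batch):
    """Send several batches at the same time; results keep batch order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batches(docs, batch_size)))
```

The `send` parameter exists so the batching can be exercised without a running Solr instance; the right batch size and number of workers have to be found by measurement.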

SolrCloud is not a magic bullet that solves all problems.  It's just a
different way of managing indexes that has more automation, and makes
initial setup of a distributed index a lot easier.  It doesn't do the
job any faster than running without SolrCloud.  The legacy master/slave
mode is likely to be a little bit faster.

You haven't provided any of the information required for us to guess
about the system requirements.  And it will be a guess ... we could be
completely wrong.

https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Thanks,
Shawn

Re: Solr cloud production set up

Rajdeep Sahoo
Hi Shawn,
Thanks for your reply.

We already index in parallel in production.

How does search performance in SolrCloud compare with master/slave?
And what about block join performance in SolrCloud?
Do we need more infrastructure for SolrCloud, since we would be
maintaining multiple shards and replicas?
Is there any correlation with the master/slave setup?

Re: Solr cloud production set up

David Hastings
In reply to this post by Shawn Heisey-2
Agreed with the above. What’s your idea of “huge”? I have 600-ish GB in one core plus another 250x2 in two more on the same standalone Solr instance, and it runs more than fine.

Re: Solr cloud production set up

Rajdeep Sahoo
We have 2.3 million documents, and the index size is 2.5 GB.
Each node has a 10-core CPU and 24 GB of RAM. There are 16 slave nodes.

Still, some of the queries take 50 seconds on the Solr side.
Note that we are using Solr 4.6.
Also, we have about 200 facet fields (on average) in a query,
and 30 searchable fields.
Is there any way to identify why a query takes 50 seconds?
There are multiple concurrent requests.

Re: Solr cloud production set up

Shawn Heisey-2
In reply to this post by Rajdeep Sahoo
On 1/18/2020 9:55 AM, Rajdeep Sahoo wrote:
> We do parallel indexing in production,
>
>   What about search performance in solr cloud in comparison with master
> slave.
>     And what about  block join performance in solr cloud.
>     Do we need to increase the infra for solr cloud as we would be
> maintaining multiple shard and replica.
>    Is there any co relation with master slave set up.

As I said before, SolrCloud is not a magic bullet that solves
performance issues.  If the index characteristics are the same (number
of docs, total size), performance in SolrCloud will be nearly identical
to non-cloud.

Thanks,
Shawn

Re: Solr cloud production set up

Rajdeep Sahoo
Hi Shawn,
Thanks for this info.
Could you please address my query below?

We have 2.3 million documents, and the index size is 2.5 GB.
With this data, do we need SolrCloud?

Each node has a 10-core CPU and 24 GB of RAM. There are 16 slave nodes.

Still, some of the queries take 50 seconds on the Solr side.
Note that we are using Solr 4.6.
Also, we have about 200 facet fields (on average) in a query,
and 30 searchable fields.
Is there any way to identify why a query takes 50 seconds?
There are multiple concurrent requests.

And how can we optimize the search response time? It is almost 1 minute for
some requests.

Re: Solr cloud production set up

Walter Underwood
For indexing, is the master node CPU around 90%? If not, you aren’t sending requests fast enough or your disk is slow.

For querying, 200 facet fields is HUGE. That will take a lot of Java heap memory and will be slow. Each facet field requires large in-memory arrays and sorting.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

Re: Solr cloud production set up

Rajdeep Sahoo
Although we have an average of 200 facet fields in the search request, not
all of them will have values in each request.
At most 50-60 facet fields will have values.
We are also using function queries. Do they have a performance impact?

Re: Solr cloud production set up

Shawn Heisey-2
In reply to this post by Rajdeep Sahoo
On 1/18/2020 10:09 AM, Rajdeep Sahoo wrote:
> We are having 2.3 million documents and size is 2.5 gb.
>    10 core cpu and 24 gb ram . 16 slave nodes.
>
>    Still some of the queries are taking 50 sec at solr end.
> As we are using solr 4.6 .
>    Other thing is we are having 200 (avg) facet fields  in a query.
>   And 30 searchable fields.
>   Is there any way to identify why it is taking 50 sec for a query.
>      Multiple concurrent requests are there.

Searching 30 fields and computing 200 facets is never going to be super
fast.  Switching to cloud will not help, and might make it slower.

Your index is pretty small to a lot of us.  There are people running
indexes with billions of documents that take terabytes of disk space.

As Walter mentioned, computing 200 facets is going to require a fair
amount of heap memory.  One *possible* problem here is that the Solr
heap size is too small, so a lot of GC is required.  How much of the
24GB have you assigned to the heap?  Is there any software other than
Solr running on these nodes?

Thanks,
Shawn

Re: Solr cloud production set up

Rajdeep Sahoo
We have assigned 16 GB of the 24 GB to the heap.
No other process is running on that node.

There are 200 facet fields in the query, but we will not get values for
every facet on every search.
At most 50-60 facets will return values.

We are using caching. Is that not going to help?

Re: Solr cloud production set up

David Hastings
If you’re not getting values, don’t ask for the facet. Facets are expensive as hell. Maybe you should think more about your queries than your infrastructure; SolrCloud won’t help you at all, especially if you’re asking for things you don’t need.
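
One way to act on this is to build the facet parameters per request instead of always sending the full list. The Python sketch below assumes the application already knows which facet fields can have values for a given search (for example from a per-category configuration); the function name and the field names are hypothetical.

```python
from urllib.parse import urlencode


def build_facet_params(query, candidate_fields, fields_with_values):
    """Build a Solr query string that facets only on the fields that matter."""
    wanted = [f for f in candidate_fields if f in fields_with_values]
    params = [("q", query), ("wt", "json")]
    if wanted:
        params.append(("facet", "true"))
        # facet.field may be repeated, once per requested field.
        params.extend(("facet.field", f) for f in wanted)
    return urlencode(params)


# Hypothetical field names: only two of the three candidates are requested.
print(build_facet_params("laptop", ["brand", "color", "size"], {"brand", "size"}))
```

Going from about 200 requested facet fields to the 50-60 that actually return values should reduce the faceting work per query, though how much time it saves has to be measured.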

Re: Solr cloud production set up

Rajdeep Sahoo
Thanks for the suggestion.

Is there any way to find out which operation or which query parameters are
increasing the response time?

Re: Solr cloud production set up

Erick Erickson
Add &debug=timing to the query and it’ll show you the time each component takes.
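
To make the timing output easier to read, a small Python helper can rank the components by time. This is a sketch that assumes the response was requested with wt=json and that the timing section has the usual prepare/process layout; the sample response and its numbers are made up for illustration.

```python
def slowest_components(response, phase="process"):
    """Return (component, milliseconds) pairs for one phase, slowest first."""
    timing = response.get("debug", {}).get("timing", {})
    phase_times = timing.get(phase, {})
    per_component = [
        (name, entry["time"])
        for name, entry in phase_times.items()
        # The phase total is a plain number; components are nested objects.
        if isinstance(entry, dict) and "time" in entry
    ]
    return sorted(per_component, key=lambda item: item[1], reverse=True)


# Made-up response fragment in the assumed shape of a debug=timing reply.
sample_response = {
    "debug": {
        "timing": {
            "time": 50120.0,
            "prepare": {"time": 20.0, "query": {"time": 15.0}, "facet": {"time": 5.0}},
            "process": {
                "time": 50100.0,
                "query": {"time": 900.0},
                "facet": {"time": 49000.0},
                "debug": {"time": 200.0},
            },
        }
    }
}

for name, millis in slowest_components(sample_response):
    print(f"{name}: {millis:.0f} ms")
```

In this made-up sample the facet component dominates, which is the pattern the earlier replies in this thread would lead one to expect with around 200 facet fields.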
