very high query time on solr due to high CPU usage

Saurabh Sharma
Hi all,

I have been observing an unusual pattern in our Solr resource usage.
I am running a cluster with 3 nodes, and RAM on each node is 12GB.
We do hard commits every 1 minute and soft commits every 15 seconds.
Under normal circumstances, Solr response time is ~15 ms and CPU usage is
around 200% (~2 cores). But there are occasional periods when CPU load
suddenly increases to around 2000% (~20 cores) and response time increases
to 1000 ms. When this happens, almost every query becomes slow, CPU usage
keeps increasing for around 30 minutes, and even a restart doesn't help.
RAM usage remains constant for this duration. Does Solr start behaving
like this under high traffic? What are the possible reasons for very high
CPU usage in real-time Solr?

I am using Solr 7.3.1 and Java 8.

Thanks
Saurabh

Re: very high query time on solr due to high CPU usage

Shawn Heisey-2
On 5/11/2019 8:05 AM, Saurabh Sharma wrote:

> I have been observing a very unique pattern in our solr resource usage.
> I am running a cluster with 3 nodes and RAM on each node is 12GB.
> We are doing hard commits every 1 minute and soft commits every 15 seconds.
> Under normal circumstances solr response time is ~15 ms and CPU usage of
> around 200% (~2 cores) . But we have few instances where CPU load suddenly
> increase to around 2000%(~ 20 cores) and response time increase to 1000ms
> .At the time this situation happen , almost every query start taking time
> and CPU usage keeps on increasing for around 30 mins and even restart don't
> help. RAM usage remains constant for this duration.Do solr start behaving
> like this under high traffic situations ? what can be the possible reasons
> of very high CPU usage in real time solr.

If you get the maxDoc number from every core (index) in that Solr
instance, and add those numbers up, you'll get a total document count
for the whole node.  What are those numbers?
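
One way to collect that total is to read each core's `index.maxDoc` from the CoreAdmin STATUS response and add the values up. The following is a minimal sketch, not a definitive tool: the base URL in `fetch_core_status` is an assumption about where Solr is listening, and the core names and counts in `sample` are made up purely for illustration.

```python
import json
from urllib.request import urlopen


def total_max_doc(status_response: dict) -> int:
    """Sum index.maxDoc over every core in a CoreAdmin STATUS response."""
    cores = status_response.get("status", {})
    # A core that failed to load may have no "index" section; count it as 0.
    return sum(core.get("index", {}).get("maxDoc", 0) for core in cores.values())


def fetch_core_status(base_url: str) -> dict:
    """Fetch CoreAdmin STATUS as JSON.

    base_url is an assumption, for example "http://localhost:8983/solr".
    """
    with urlopen(f"{base_url}/admin/cores?action=STATUS&wt=json") as resp:
        return json.load(resp)


# Hypothetical response shape with made-up core names and numbers.
sample = {
    "status": {
        "core_a": {"index": {"numDocs": 900_000, "maxDoc": 1_500_000}},
        "core_b": {"index": {"numDocs": 200_000, "maxDoc": 250_000}},
    }
}

print(total_max_doc(sample))
```

On a live node, `total_max_doc(fetch_core_status(base_url))` would give the same sum for the real cores.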

How much disk space do all the indexes take?

What is Solr's max heap size?  Are there any other programs running on
that node other than the one Solr instance?  This would include multiple
Solr instances.

If you can get the screenshot mentioned at the link below, that can
reveal some of the information I have asked for above, but not all of it.

https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue

What query rate is Solr handling?

Do you have multiple Solr servers for this use, or just one?

Thanks,
Shawn

Re: very high query time on solr due to high CPU usage

Saurabh Sharma
Hi Shawn,

I am providing the data you asked for. If anything else is required, please
let me know.

*If you get the maxDoc number from every core (index) in that Solr *
*instance, and add those numbers up, you'll get a total document count *
*for the whole node.  What are those numbers?*

-> This Solr server is running a single core. The SolrCloud cluster has 3
servers on different machines, with maxShardsPerNode set to 1 and a
replication factor of 3.
The full collection is present on all 3 nodes. I checked maxDoc on every
node, and it was around 1.5 million on each node, with 0.9 million active
records.
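
As a rough reading of these two numbers, the gap between maxDoc and the active document count is the number of deleted documents still held in the index segments. The sketch below assumes that "active records" corresponds to Solr's numDocs; the function name is hypothetical.

```python
def deleted_fraction(max_doc: int, num_docs: int) -> float:
    """Fraction of index entries that are deleted but not yet merged away."""
    if max_doc <= 0:
        return 0.0
    return (max_doc - num_docs) / max_doc


# Figures reported above: ~1.5 million maxDoc, ~0.9 million active documents.
fraction = deleted_fraction(1_500_000, 900_000)
print(f"deleted documents: {fraction:.0%} of maxDoc")
```

With the reported figures, roughly 0.6 million of the 1.5 million entries per node are deleted documents.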


*How much disk space do all the indexes take?*
-> Index size is around 2GB per node.

*What is Solr's max heap size?  Are there any other programs running on *
*that node other than the one Solr instance?  This would include multiple *
*Solr instances.*
-> Maximum heap size for Solr is set to 12GB, but each node generally uses 3
to 5 GB. These Solr instances are hosted on servers where we run
other services too.
Each machine on which we run Solr has 64GB of RAM and a 24-core
CPU. During peak CPU usage I have seen Solr consuming 2000% CPU, which
causes the issue.

*If you can get the screenshot mentioned at the link below, that can *
*reveal some of the information I have asked for above, but not all of it.*

*https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue*
-> What specific information is required?


*What query rate is Solr handling?*
-> 80-100 queries per second on each Solr node during peak time. That adds up
to 250-300 requests per second across the 3 replicas of the same index.


*Do you have multiple Solr servers for this use, or just one?*
-> We use a single Solr cluster with three nodes in it.


I generally face the issue once every few days, and the one thing common
to all such occurrences is high traffic. I suspect that, due to the dynamic
nature of the index, cache misses increase during high traffic and Solr
starts doing run-time computations, resulting in high CPU usage. I have not
been able to find any other explanation so far.


Thanks
Saurabh Sharma


Re: very high query time on solr due to high CPU usage

Shawn Heisey-2
On 5/11/2019 12:49 PM, Saurabh Sharma wrote:

> Full collection is present on all 3 nodes.I have checked max docs on every
> node and they were around 1.5 million on each node with 0.9 Millions active
> records.
>
> *How much disk space do all the indexes take?*
> -> index size is around 2GB/per node.
>
> *What is Solr's max heap size?  Are there any other programs running on *
> *that node other than the one Solr instance?  This would include multiple *
> *Solr instances.*
> ->Maximum heap size for solr is set to 12GB but each node generally take 3
> to 5 GB.These solr instances are hosted on servers where we are running
> other services too.
> each machine on which we are running solr is having 64GB of RAM and 24 core
> cpu.During the peek CPU usage i have seen solr consuming 2000% cpu that
> causes issue.

A 12GB heap seems excessive for 1.5 million docs taking up 2GB of space,
unless you are running extremely resource intensive queries -- facets or
grouping on high cardinality fields, for instance.

With other programs on the server, the system's memory may not be fully
available for the operating system to cache the index data.  That is the
secret to good Solr performance -- getting relevant parts of the index
into the OS disk cache so that Solr doesn't need to actually read the
data off the disk.
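
To make the disk cache point concrete, here is a back-of-the-envelope sketch of how much RAM is left for the OS to cache index data once the Solr heap and the other services are accounted for. The 64GB, 12GB, and 2GB figures come from this thread; the 40GB used by other processes is a purely hypothetical placeholder, since that number was not reported, and the function names are illustrative only.

```python
def page_cache_headroom_gb(total_ram_gb: float,
                           solr_heap_gb: float,
                           other_processes_gb: float) -> float:
    """RAM left for the OS disk cache after the Solr heap and other processes."""
    return max(0.0, total_ram_gb - solr_heap_gb - other_processes_gb)


def index_fits_in_cache(index_size_gb: float, headroom_gb: float) -> bool:
    """True if the whole index could be held in the OS disk cache."""
    return index_size_gb <= headroom_gb


# 64GB RAM and a 12GB heap are from the thread; 40GB for other services
# is a hypothetical value, not a measurement.
headroom = page_cache_headroom_gb(64, 12, other_processes_gb=40)
print(headroom, index_fits_in_cache(2, headroom))
```

The real answer depends on the resident memory of the other processes, which is what the requested process listing is meant to show.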

> *If you can get the screenshot mentioned at the link below, that can *
> *reveal some of the information I have asked for above, but not all of it.*
>
> *https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue
> <https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue>*
> -> What specific information is required ?

The specific information required is the screenshot of a process listing
as described at that page.  You'll need to use a file sharing site, as
this mailing list typically eats email attachments.  This screenshot
provides a very good overview of the system that we can use to determine
whether we expect good performance.

> *What query rate is Solr handling?*
> -> 80-100 query per second on each solr node during peek time.It sums up to
> 250-300 request on 3 replicas of same index.

That's a very high query rate.  It will be even more important for the
system to have the index data in memory.
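
A rough way to see why latency matters so much at this rate is Little's law: the average number of requests in flight equals the arrival rate multiplied by the average response time. The sketch below plugs in figures from this thread and assumes, as a simplification, that the 100 queries per second peak rate applies both to the normal ~15 ms case and to the degraded ~1000 ms case.

```python
def in_flight_requests(queries_per_second: float, latency_seconds: float) -> float:
    """Little's law: average concurrency = arrival rate * average time in system."""
    return queries_per_second * latency_seconds


# Assumption: the same 100 queries/second rate in both situations.
normal = in_flight_requests(100, 0.015)   # ~15 ms responses
degraded = in_flight_requests(100, 1.0)   # ~1000 ms responses

print(f"normal: ~{normal:.1f} in flight, degraded: ~{degraded:.0f} in flight")
```

Under these assumptions, a node goes from one or two concurrent queries to around a hundred once responses slow to a second, which is more than enough to keep a 24-core machine busy.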

Thanks,
Shawn

Re: very high query time on solr due to high CPU usage

Saurabh Sharma
Hi Shawn,

I faced the issue again, and restarting the leader worked for me this time.
Please find attached the output of the top command for further insights.

The first java process in the screenshot is Solr.

Could there be some issue with this particular node?

Looking forward to hearing from you.

Regards
Saurabh 


Re: very high query time on solr due to high CPU usage

Shawn Heisey-2
On 5/25/2019 5:11 AM, Saurabh Sharma wrote:
> I again faced the issue and restarting the leader worked for me this time.
> Please find attached the top command for further insights.
>
> First java process in screenshot is solr.
>
> Can it be a possibility that there are some issue with this particular node?

Attachments almost never make it to the list.  Your screenshot did not
show up.

You will need to use a file sharing site and provide a link.  And the
file will need to remain on that site long enough for people to look at
it and come to some kind of conclusion.

Thanks,
Shawn

Re: very high query time on solr due to high CPU usage

Saurabh Sharma
Hi,

Link to image https://ibb.co/8g6gXwr

Thanks & Regards


Re: very high query time on solr due to high CPU usage

Shawn Heisey-2
On 5/25/2019 6:43 AM, Saurabh Sharma wrote:
> Hi,
>
> Link to image https://ibb.co/8g6gXwr

That screenshot is not sorted the way that was mentioned on the wiki
page - by the RES memory column.  I do see several other Java processes
besides Solr.  There might be other high-memory use processes that are
not visible because of the sort order.

It doesn't look like memory is a big problem here -- there is still 5GB
free at the OS level, disk cache usage looks OK, and iowait CPU
percentage seems to be zero.

Based on what I can see, I'm betting you're just throwing more load at
this machine than it can handle currently.  It might be a matter of the
query rate, or it might be that when this happens, Solr is handling
particularly resource-intensive queries.

It does look like you have memory to spare, so you could try increasing
Solr's heap from 12GB to 14GB or 16GB, see if that helps at all.  If it
does, it means the heap was smaller than it needed to be to handle the
kinds of queries Solr was doing.  I'm guessing here.

Thanks,
Shawn