Frequent garbage collections after a day of operation

Matthias Käppler
Hey everyone,

we're running into some operational problems with our SOLR production
setup here and were wondering if anyone else is affected or has even
solved these problems before. We're running a vanilla SOLR 3.4.0 in
several Tomcat 6 instances, so nothing out of the ordinary, but after
a day or so of operation we see increased response times from SOLR,
up to three times higher on average. During this time we see increased
CPU load due to heavy garbage collection in the JVM, which bogs down
the whole system, so throughput naturally decreases. Restarting the
slaves brings everything back to normal, but that's more of a brute
force solution.

The thing is, we don't know what's causing this and we don't have that
much experience with Java stacks since we're for the most part a Rails
company. Are Tomcat 6 or SOLR known to leak memory? Is anyone else
seeing this, or can you think of a reason for this? Most of our
queries to SOLR involve the DismaxHandler and the spatial search query
components. We don't use any custom request handlers so far.

Thanks in advance,
-Matthias

--
Matthias Käppler
Lead Developer API & Mobile

Qype GmbH
Großer Burstah 50-52
20457 Hamburg
Telephone: +49 (0)40 - 219 019 2 - 160
Skype: m_kaeppler
Email: [hidden email]

Managing Director: Ian Brotherston
Amtsgericht Hamburg
HRB 95913

Re: Frequent garbage collections after a day of operation

Chantal Ackermann
Make sure each of your Tomcat instances is started with a max heap size
(-Xmx), and that these add up to something a lot lower than the total
RAM of your system.
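
To make that arithmetic concrete, here is a minimal Java sketch of the check. The class name HeapBudget, the 50% ceiling and the example figures (four instances with 2 GB heaps on a 16 GB machine) are illustrative assumptions, not numbers from this thread; how much RAM to leave free depends on what else runs on the box.

```java
import java.util.Arrays;
import java.util.List;

/** Checks whether the summed max heaps of several JVMs leave enough RAM free. */
final class HeapBudget {

    /**
     * @param totalRamMb   physical RAM of the machine in MB
     * @param maxHeapsMb   the -Xmx value of each Tomcat instance in MB
     * @param maxHeapShare assumed ceiling for the heaps' combined share of RAM (0..1)
     * @return true if the heaps together stay at or below that share
     */
    static boolean fits(long totalRamMb, List<Long> maxHeapsMb, double maxHeapShare) {
        long sum = 0;
        for (long heap : maxHeapsMb) {
            sum += heap;
        }
        return sum <= totalRamMb * maxHeapShare;
    }

    public static void main(String[] args) {
        // Hypothetical machine: 16 GB RAM, four Tomcat instances with -Xmx2048m each.
        List<Long> heaps = Arrays.asList(2048L, 2048L, 2048L, 2048L);
        System.out.println("heaps fit within budget: " + fits(16_384, heaps, 0.5));
    }
}
```

Note that the JVM needs memory beyond the heap (thread stacks, permanent generation, native buffers), so the real footprint of each instance is somewhat larger than its -Xmx value.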

Frequent garbage collection means that your application keeps requesting
more memory while the heap is nearly full, so the JVM has to run the
garbage collector to free space before new objects can be created. That
by itself does not indicate a memory leak, unless you are running a
custom EntityProcessor in DIH that runs into an infinite loop and
creates huge amounts of schema fields. ;-)
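
One way to put a number on "frequent" is to ask the JVM how much time it has spent collecting. The following is a minimal sketch using the standard java.lang.management API; the class name GcOverhead is made up for illustration, and it reports on the JVM it runs in, so for a Tomcat instance the same beans would have to be read over JMX instead.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reports cumulative GC activity of the current JVM since startup. */
final class GcOverhead {

    /** Fraction of elapsed time spent in GC; both arguments in milliseconds. */
    static double fraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    /** Sums the accumulated collection time over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = totalGcMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gcTime, uptime, 100.0 * fraction(gcTime, uptime));
    }
}
```

Sampling this figure periodically and comparing a fresh instance with one that has been up for a day would show whether the share of time spent in GC is really climbing.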

Also - if you are doing hot deploys on Tomcat, you will have to restart
the Tomcat instance on a regular basis, as hot deploys DO leak memory
after a while. (You might be seeing class undeploy messages in
catalina.out and later on OutOfMemory error messages.)

If this is not of any help you will probably have to provide a bit more
information on your Tomcat and SOLR configuration setup.

Chantal


On Thu, 2012-02-16 at 16:22 +0100, Matthias Käppler wrote:

> [...]

RE: Frequent garbage collections after a day of operation

Bryan Loofbourrow
In reply to this post by Matthias Käppler
A couple of thoughts:

We wound up doing a bunch of tuning on the Java garbage collection.
However, the pattern we were seeing was periodic, very extreme slowdowns,
because we were then using the default garbage collector, which blocks
the application when it has to do a major collection. This doesn't sound like your
problem, but it's something to be aware of.

One thing that could fit the pattern you describe would be Solr caches
filling up and getting you too close to your JVM or memory limit. For
example, if you have large documents, and have defined a large document
cache, that might do it.
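
A rough back-of-the-envelope check for that scenario, as a Java sketch. The entry count, average document size and heap size are invented for illustration; real numbers would come from the documentCache size configured in solrconfig.xml and from measuring your own stored fields. The estimate also ignores per-object overhead, so treat it as a lower bound.

```java
/** Rough estimate of the heap consumed by a completely filled document cache. */
final class CacheEstimate {

    /** Estimated bytes held when the cache is full. */
    static long fullCacheBytes(long maxEntries, long avgDocBytes) {
        return maxEntries * avgDocBytes;
    }

    /** Share of the max heap that a cache of the given size would occupy. */
    static double heapShare(long cacheBytes, long maxHeapBytes) {
        return (double) cacheBytes / maxHeapBytes;
    }

    public static void main(String[] args) {
        long cache = fullCacheBytes(50_000, 20 * 1024);   // hypothetical: 50k docs of ~20 KB
        long heap = 2L * 1024 * 1024 * 1024;              // hypothetical: -Xmx2g
        System.out.printf("full cache: ~%d MB, %.0f%% of max heap%n",
                cache / (1024 * 1024), 100.0 * heapShare(cache, heap));
    }
}
```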

I found it useful to point jconsole (free with the JDK) at my JVM, and
watch the pattern of memory usage. If the troughs at the bottom of the GC
cycles keep rising, you know you've got something that is continuing to
grab more memory and not let go of it. Now that our JVM is running
smoothly, we just see a sawtooth pattern, with the troughs approximately
level. When the system is under load, the frequency of the wave rises. Try
it and see what sort of pattern you're getting.
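
The same check can be scripted. Below is a sketch in Java, assuming the heap's post-GC usage is sampled at regular intervals: MemoryPoolMXBean.getCollectionUsage() is the standard call for "usage after the most recent collection", while the class name TroughTrend, the least-squares slope test and the sample values are illustrative assumptions.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Detects whether the heap usage left after each GC keeps climbing. */
final class TroughTrend {

    /** Sum of post-GC usage over all heap pools of the current JVM, in bytes. */
    static long postGcHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if unsupported
            if (pool.getType() == MemoryType.HEAP && afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    /** Least-squares slope of the samples, in sample units per interval. */
    static double slope(long[] troughs) {
        int n = troughs.length;
        if (n < 2) {
            return 0.0;
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (long t : troughs) {
            meanY += t;
        }
        meanY /= n;
        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (troughs[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    public static void main(String[] args) {
        long[] level = {400, 410, 395, 405, 400};   // MB, troughs of a healthy sawtooth
        long[] rising = {400, 450, 520, 580, 650};  // MB, something is holding on to memory
        System.out.println("level slope:  " + slope(level));
        System.out.println("rising slope: " + slope(rising));
        System.out.println("post-GC heap right now: " + postGcHeapBytes() + " bytes");
    }
}
```

A slope near zero corresponds to the level sawtooth described above; a clearly positive slope over many samples corresponds to rising troughs.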

-- Bryan

> -----Original Message-----
> From: Matthias Käppler [mailto:[hidden email]]
> Sent: Thursday, February 16, 2012 7:23 AM
> To: [hidden email]
> Subject: Frequent garbage collections after a day of operation
>
> [...]

Re: Frequent garbage collections after a day of operation

Jason Rutherglen
> One thing that could fit the pattern you describe would be Solr caches
> filling up and getting you too close to your JVM or memory limit

This [uncommitted] issue would solve that problem by allowing the GC
to collect caches that become too large, though in practice, the cache
setting would need to be fairly large for an OOM to occur from them:
https://issues.apache.org/jira/browse/SOLR-1513
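
For readers unfamiliar with the idea: one standard JVM mechanism for letting the GC reclaim cache entries is the soft reference, which the collector may clear when memory runs low. The sketch below is a generic illustration of a soft-valued cache in plain Java. It is not the code attached to SOLR-1513, and the class name SoftCache and its methods are hypothetical.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** A cache whose values the GC may reclaim when memory runs low. */
final class SoftCache<K, V> {

    private final ConcurrentMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    /** Returns the cached value, or null if absent or already collected. */
    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key, ref); // drop the stale entry left behind by the GC
        }
        return value;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SoftCache<String, byte[]> cache = new SoftCache<>();
        byte[] doc = new byte[1024];       // held strongly here, so it cannot be cleared yet
        cache.put("doc-1", doc);
        System.out.println("hit:  " + (cache.get("doc-1") == doc));
        System.out.println("miss: " + (cache.get("doc-2") == null));
    }
}
```

The trade-off is that callers must always handle a miss, since an entry can vanish at any collection once nothing else references its value.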

On Thu, Feb 16, 2012 at 7:14 PM, Bryan Loofbourrow
<[hidden email]> wrote:

> [...]

Re: Frequent garbage collections after a day of operation

Erick Erickson
A wonderful writeup on various garbage collection concerns:
http://www.lucidimagination.com/blog/2011/03/27/garbage-collection-bootcamp-1-0/



On Fri, Feb 17, 2012 at 12:27 AM, Jason Rutherglen
<[hidden email]> wrote:

> [...]