Stopping Solr JVM on OOM

Stopping Solr JVM on OOM

CP Mishra-2
Looking at the previous threads (and in our own tests), the OOM script specified on
the command line does not work, because the OutOfMemoryError is trapped and
converted to a RuntimeException. So what is the best way to stop Solr when it gets
into an OOM state? The only way I see is to override multiple handlers and call
System.exit() from there. Is there a better way?

We are using Solr with the default Jetty container.
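
To make the "exit from a handler" idea concrete, here is a minimal sketch. It assumes the OutOfMemoryError is still reachable through the cause chain of the wrapping RuntimeException, which may not hold for every code path. `OomGuard` and its methods are hypothetical names, not part of Solr or Jetty.

```java
/** Hypothetical helper for illustration; not part of Solr or Jetty. */
class OomGuard {

    /** Returns true if t, or anything in its cause chain, is an OutOfMemoryError. */
    static boolean hasOomCause(Throwable t) {
        int depth = 0;
        // Bound the walk so a cyclic cause chain cannot loop forever.
        while (t != null && depth++ < 64) {
            if (t instanceof OutOfMemoryError) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    /** Installs a last-resort handler for throwables that escape a thread. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            if (hasOomCause(error)) {
                // halt() skips shutdown hooks, which may themselves need memory.
                Runtime.getRuntime().halt(1);
            }
        });
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException(new OutOfMemoryError("simulated"));
        System.out.println(hasOomCause(wrapped));
        System.out.println(hasOomCause(new IllegalStateException("unrelated")));
    }
}
```

The handler only sees throwables that nobody caught, so it does nothing when the container catches and logs the wrapped error. In that case the same `hasOomCause` check would have to be called from whatever code does the catching.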

Thanks,
CP Mishra

Re: Stopping Solr JVM on OOM

Fuad Efendi
The best practice: never catch Throwable or its descendants Error, VirtualMachineError, OutOfMemoryError, etc.

Never ever.

Also, do not swallow InterruptedException in a loop.

These are a few simple rules to avoid a hanging application. If we follow them, there will be no question of "what is the best way to stop Solr when it gets in OOM" (or when it simply becomes unresponsive because of swallowed exceptions).
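
A minimal sketch of the two rules, assuming a simple queue-driven worker thread. `PoliteWorker` is a hypothetical class used only for illustration and is not taken from Solr.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical worker that follows both rules. */
class PoliteWorker implements Runnable {
    private final BlockingQueue<Runnable> tasks;
    private final AtomicInteger completed = new AtomicInteger();

    PoliteWorker(BlockingQueue<Runnable> tasks) {
        this.tasks = tasks;
    }

    int completedCount() {
        return completed.get();
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                Runnable task = tasks.take();
                try {
                    task.run();
                    completed.incrementAndGet();
                } catch (Exception e) {
                    // Rule 1: catch Exception, not Throwable, so that an Error
                    // such as OutOfMemoryError still propagates.
                    System.err.println("task failed: " + e);
                }
            } catch (InterruptedException e) {
                // Rule 2: restore the flag so the loop condition sees it and exits.
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Runs one failing and one succeeding task, then interrupts the worker. */
    static int demo() {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        PoliteWorker worker = new PoliteWorker(queue);
        Thread thread = new Thread(worker, "polite-worker");
        thread.start();
        CountDownLatch secondTaskRan = new CountDownLatch(1);
        queue.add(() -> { throw new IllegalStateException("ordinary failure"); });
        queue.add(secondTaskRan::countDown);
        try {
            secondTaskRan.await(5, TimeUnit.SECONDS);
            thread.interrupt();
            thread.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // -1 means the worker ignored the interrupt and is still running.
        return thread.isAlive() ? -1 : worker.completedCount();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + demo());
    }
}
```

The ordinary failure is logged and the worker carries on, while the interrupt ends the loop instead of being lost.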


-- 
Fuad Efendi
416-993-2060(cell)


Re: Stopping Solr JVM on OOM

CP Mishra-2
Solr & Lucene dev folks must be catching Throwable for a reason. Anyway, I
am asking for solutions that I can use.


Re: Stopping Solr JVM on OOM

Muhammad Zahid Iqbal
You can use the ping functionality, with a timeout that suits your
container/web apps. If it is not responding, you can restart your container.
Cheers!

If there is any other solution, I am interested too.
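
A minimal sketch of such a ping check with a timeout. The URL is an assumption: it supposes a core named `mycore` on the default port with the ping handler at `/admin/ping`, so adjust it to your setup. `SolrPing` is a hypothetical name.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical watchdog helper; the URL layout is an assumption. */
class SolrPing {

    /** Returns true only if the URL answers HTTP 200 within the timeout. */
    static boolean isAlive(String pingUrl, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(pingUrl).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            try {
                return conn.getResponseCode() == 200;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // Refused connections, timeouts and malformed URLs all count as "not alive".
            return false;
        }
    }

    public static void main(String[] args) {
        String url = args.length > 0
                ? args[0]
                : "http://localhost:8983/solr/mycore/admin/ping";
        System.out.println(url + " alive: " + isAlive(url, 2000));
    }
}
```

An external scheduler could run this periodically and restart the container when it reports false. Note that a JVM struggling after an OOM may still answer a ping, so this is a coarse check.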


Re: Stopping Solr JVM on OOM

Shawn Heisey-2
In reply to this post by Fuad Efendi
On 2/25/2016 2:06 PM, Fuad Efendi wrote:
> The best practice: do not ever try to catch Throwable or its descendants Error, VirtualMachineError, OutOfMemoryError, and etc.
>
> Never ever.
>
> Also, do not swallow InterruptedException in a loop.
>
> Few simple rules to avoid hanging application. If we follow these, there will be no question "what is the best way to stop Solr when it gets in OOM” (or just becomes irresponsive because of swallowed exceptions)

As I understand from SOLR-8539, if an OOM is thrown by a Java program
and there is a properly configured OOM script, regardless of what
happens with exception rewrapping, the script *should* kick in.  Here's
an issue where this behavior was verified by a Jetty developer on a
small-scale test program which catches and swallows the OOM:

https://issues.apache.org/jira/browse/SOLR-8539

Solr 5.x, when started on Linux/UNIX systems with the included shell
scripts, comes by default with an "oom killer" script that is supposed to
stop Solr when an OOM occurs.

Recently it was discovered that the OnOutOfMemoryError option in the
start script for Linux/UNIX was being incorrectly specified on the
command line -- it doesn't actually work.  Here's the issue for that
problem:

https://issues.apache.org/jira/browse/SOLR-8145

The fix for the incorrect OnOutOfMemoryError usage will be in version
6.0 when that version is finally released, which I think will make the
OOM killer actually work on Linux/UNIX.  There is currently no concrete
information on when 6.0 is expected.  If any plans for future 5.x
versions come up, that fix will likely make it into those versions as well.
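
One way to check whether the option actually reached the JVM is to look at the JVM's own argument list. This is a minimal sketch, assuming it runs inside the JVM you want to inspect. `OomHookCheck` is a hypothetical name; `getInputArguments()` is the standard RuntimeMXBean call and only lists options the JVM accepted as its own, not arguments passed to the program.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

/** Hypothetical diagnostic; not part of Solr. */
class OomHookCheck {
    static final String PREFIX = "-XX:OnOutOfMemoryError=";

    /** Returns the command registered for OnOutOfMemoryError, if any. */
    static Optional<String> findOomCommand(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(arg -> arg.startsWith(PREFIX))
                .map(arg -> arg.substring(PREFIX.length()))
                .findFirst();
    }

    public static void main(String[] args) {
        // JVM options only; arguments to the main class are not included here.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(findOomCommand(jvmArgs)
                .map(cmd -> "OOM hook registered: " + cmd)
                .orElse("No OnOutOfMemoryError option reached the JVM"));
    }
}
```

For a running Solr, the same list is exposed as the `InputArguments` attribute of the Runtime MXBean, so a JMX client can read it without adding code.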

There is no OOM killer script for Windows, so this feature is not
present when running on Windows.  If somebody can come up with a way for
Windows to find and kill the Solr process, I'd be happy to include it.
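
For what it is worth, here is a platform-neutral sketch of the "find and kill" step in Java rather than batch. It rests on several assumptions: Java 9 or later for `ProcessHandle`, a Solr command line that contains `start.jar` and `-Djetty.port=<port>`, and an operating system that exposes command lines to `ProcessHandle` (this varies by platform and permissions). `SolrProcessFinder` is a hypothetical name.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical sketch; requires Java 9+ for ProcessHandle. */
class SolrProcessFinder {

    /** Assumption: Solr's Jetty runs start.jar with -Djetty.port=<port>. */
    static boolean looksLikeSolr(String commandLine, int port) {
        // Require whitespace or string boundaries so port 898 does not match 8983.
        Pattern portOption =
                Pattern.compile("(^|\\s)-Djetty\\.port=" + port + "(\\s|$)");
        return commandLine.contains("start.jar")
                && portOption.matcher(commandLine).find();
    }

    /** Lists processes whose visible command line matches the pattern above. */
    static List<ProcessHandle> findSolr(int port) {
        return ProcessHandle.allProcesses()
                .filter(p -> p.info().commandLine()
                        .map(cmd -> looksLikeSolr(cmd, port))
                        .orElse(false))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8983;
        for (ProcessHandle p : findSolr(port)) {
            System.out.println("Killing Solr process " + p.pid());
            p.destroyForcibly();
        }
    }
}
```

Processes whose command line is hidden are silently skipped, so an empty result does not prove that Solr is not running.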

Thanks,
Shawn


Re: Stopping Solr JVM on OOM

Binoy Dalal
Hi Shawn,
I've just finished writing a batch OOM killer script and it seems to work
fine.

I couldn't try it on the actual Solr process, since I'm a bit stumped on how
to make Solr throw an OOM at will.
I did, however, write a separate program that throws an OOM, upon which the
script is called and the running Solr process is killed.

I would like to know how I should proceed from here with submitting the
code for review etc.

Thanks.

Regards,
Binoy Dalal

Re: Stopping Solr JVM on OOM

Shawn Heisey-2
On 3/8/2016 5:13 AM, Binoy Dalal wrote:

> I've just finished writing a batch oom killer script and it seems to work
> fine.
>
> I couldn't try it on the actual solr process since I'm a bit stumped on how
> I can make solr throw an oom at will.
> Although I did write another code that does throw an oom upon which this
> script is called and the running solr process is killed.
>
> I would like to know how I should proceed from here with submitting the
> code for review etc.

Open an Improvement issue on the SOLR project in Apache's Jira with a
title like "OOM killer for Windows" and a useful description.  Clone the
source code from git, make your changes/additions.  Create a patch using
"git diff" and upload it using SOLR-NNNN.patch as the filename -- the
same name as the Jira issue.

Making Solr OOM on purpose is possible, but it is usually better to
write a small test program with an intentional memory leak.
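
Such a test program can be very small. A minimal sketch, with the hypothetical name `LeakUntilOom`; by default it stops after a few chunks, and only the `unbounded` argument makes it leak until the heap is exhausted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical test program with an intentional memory leak. */
class LeakUntilOom {

    /** Retains every chunk so the GC can never reclaim it; stops at maxChunks. */
    static int leak(int chunkBytes, int maxChunks) {
        List<byte[]> retained = new ArrayList<>();
        while (retained.size() < maxChunks) {
            retained.add(new byte[chunkBytes]);
        }
        return retained.size();
    }

    public static void main(String[] args) {
        boolean unbounded = args.length > 0 && args[0].equals("unbounded");
        // With "unbounded" the loop ends in an OutOfMemoryError, not a return.
        int maxChunks = unbounded ? Integer.MAX_VALUE : 8;
        System.out.println("retained chunks: " + leak(1 << 20, maxChunks));
    }
}
```

Running it with a small heap, for example `java -Xmx32m LeakUntilOom unbounded` plus the `-XX:OnOutOfMemoryError` option under test, should exhaust the heap within a few dozen allocations.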

I wonder if we can write a test for OOM death.

Thanks,
Shawn


Re: Stopping Solr JVM on OOM

Binoy Dalal
Hi Shawn,
The JIRA issue is SOLR-8803 (https://issues.apache.org/jira/browse/SOLR-8803).
I've used "git diff" to create a patch, but it only contains the changes that I
made to the solr.cmd file under bin to add the -XX:OnOutOfMemoryError
option.
The actual OOM kill script is an entirely new file, and it does not show up in
the patch.
Do I upload this file along with the patch, or is there something else I
have to do to include the new file?
Please advise.

Thanks.


Regards,
Binoy Dalal

Re: Stopping Solr JVM on OOM

Binoy Dalal
I've uploaded both files.
Please review and advise.

Regards,
Binoy Dalal

Re: Stopping Solr JVM on OOM

Binoy Dalal
Hi Shawn,
Have you had a chance to check and review the patch?

Regards,
Binoy Dalal

Re: Stopping Solr JVM on OOM

Shawn Heisey-2
On 3/9/2016 6:07 AM, Binoy Dalal wrote:
> Have you had a chance to check and review the patch?

I have not.  I will look at it sometime today, probably later this
evening (UTC-7 timezone).

Thanks,
Shawn


Re: Stopping Solr JVM on OOM

Binoy Dalal
Hello Shawn,
I made the necessary changes to that OOM script.
How does it look now?
Also, can you suggest some way of testing it with Solr?
How do I make Solr OOM on purpose?

Thanks

Regards,
Binoy Dalal

Re: Stopping Solr JVM on OOM

Shawn Heisey-2
On 3/13/2016 8:13 PM, Binoy Dalal wrote:
> I made the necessary changes to that oom script?
> How does it look now?
> Also can you suggest some way of testing it with solr?
> How do I make solr oom on purpose?

Set the java heap really small.  Not entirely sure what value to use.
I'd probably start with 32m and work my way down.  With a small enough
heap, you could probably produce OOM without even trying to USE Solr.

Thanks,
Shawn


Re: Stopping Solr JVM on OOM

Binoy Dalal
I set the heap to 16 MB and tried to index about 350k records using a DIH.
This did throw an OOM for that particular thread in the console, but the
OOM script wasn't called and Solr kept running properly.
Moreover, Solr also managed to index all 350k records.

Is this the correct way to go about getting Solr to throw an OOM?
If so, where did I go wrong?
If not, what other alternative is there?

Thanks.

PS. I tried to start Solr with really low memory (about 2k), but that just
threw an error saying the heap was too small, and the JVM didn't start at all.
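
The "one thread fails, the rest keeps running" part can be reproduced outside Solr. This is a minimal sketch with a hand-thrown, simulated error; `OneThreadOom` is a hypothetical name. It only shows that an Error ending one thread does not by itself stop the JVM, and says nothing about whether the -XX:OnOutOfMemoryError script fires.

```java
/** Hypothetical demo: an Error in one thread does not stop the JVM. */
class OneThreadOom {

    /** Returns true if the caller is still running after the worker died. */
    static boolean jvmSurvivesWorkerOom() {
        Thread worker = new Thread(() -> {
            // Simulated: thrown by hand, not caused by real heap exhaustion.
            throw new OutOfMemoryError("simulated");
        }, "doomed-worker");
        // Replace the default stack-trace dump with a one-line message.
        worker.setUncaughtExceptionHandler((t, e) ->
                System.err.println(t.getName() + " died with " + e));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("main thread still running: " + jvmSurvivesWorkerOom());
    }
}
```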

Regards,
Binoy Dalal

Re: Stopping Solr JVM on OOM

Binoy Dalal
Hi Shawn,
Your thoughts on this?

Regards,
Binoy Dalal

Re: Stopping Solr JVM on OOM

jmlucjav
In reply to this post by Binoy Dalal
In order to force an OOM, do this:

- Index a sizable number of docs with a normal -Xmx. If you already have 350k
docs indexed, that should be enough.
- Now stop Solr, decrease the memory (for example, -Xmx15m), start it, and run a
query with a facet on a field with very high cardinality, asking for all
facet values. If that is not enough, add another facet field, and so on. This is
a sure way to get an OOM.
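
The query from the second step can be sketched as a URL builder. The base URL, core name and field names below are placeholders, and `FacetStressQuery` is a hypothetical name. `facet.limit=-1` asks for all facet values and `rows=0` skips the documents themselves. The `URLEncoder.encode(String, Charset)` overload needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds a memory-hungry facet query URL. */
class FacetStressQuery {

    /** Facets on every given field and requests all values of each. */
    static String buildUrl(String baseUrl, String core, String... facetFields) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append('/').append(core)
                .append("/select?q=*:*&rows=0&facet=true&facet.limit=-1");
        for (String field : facetFields) {
            url.append("&facet.field=")
               .append(URLEncoder.encode(field, StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Placeholder core and field names; use high-cardinality fields of your own.
        System.out.println(buildUrl("http://localhost:8983/solr", "mycore", "id", "title_s"));
    }
}
```

Fetching the printed URL with any HTTP client, against a Solr started with the reduced heap, is then the actual test.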
