Heap Memory Problem after Upgrading to 7.4.0

Björn Häuser
Hello,

we recently upgraded our solrcloud (5 nodes, 25 collections, 1 shard each, 4 replicas each) from 6.6.0 to 7.3.0 and shortly after to 7.4.0. We are running Zookeeper 4.1.13.

Since the upgrade to 7.3.0 and also 7.4.0 we are encountering heap space exhaustion. After obtaining a heap dump, it looks like we have a lot of IndexSearchers open for our largest collection.

The dump contains around 60 IndexSearchers, each holding around 40 MB of heap. Another 500 MB of heap is the fieldcache, which is expected in my opinion.

The current config can be found here: https://gist.github.com/bjoernhaeuser/327a65291ac9793e744b87f0a561e844

Analyzing the heap dump, Eclipse MAT says this:

Problem Suspect 1

91 instances of "org.apache.solr.search.SolrIndexSearcher", loaded by "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy 1.981.148.336 (38,26%) bytes.

Biggest instances:

        • org.apache.solr.search.SolrIndexSearcher @ 0x6ffd47ea8 - 70.087.272 (1,35%) bytes.
        • org.apache.solr.search.SolrIndexSearcher @ 0x79ea9c040 - 65.678.264 (1,27%) bytes.
        • org.apache.solr.search.SolrIndexSearcher @ 0x6855ad680 - 63.050.600 (1,22%) bytes.


Problem Suspect 2

223 instances of "org.apache.solr.util.ConcurrentLRUCache", loaded by "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy 1.373.110.208 (26,52%) bytes.


Any help is appreciated. Thank you very much!
Björn
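
The MAT figures above can be cross-checked with a little arithmetic. The following is only a sketch: the constants are copied from the report, while the helper name `mib` is made up for illustration. It derives the average retained size per searcher and the total heap size implied by each suspect's percentage.

```python
# Figures copied from the Eclipse MAT report above (bytes and percent of heap).
SEARCHER_BYTES = 1_981_148_336   # 91 SolrIndexSearcher instances
SEARCHER_COUNT = 91
SEARCHER_SHARE = 38.26           # percent of the analyzed heap
CACHE_BYTES = 1_373_110_208      # 223 ConcurrentLRUCache instances
CACHE_SHARE = 26.52              # percent of the analyzed heap


def mib(n_bytes):
    """Convert a byte count to MiB (hypothetical helper)."""
    return n_bytes / (1024 * 1024)


avg_searcher_bytes = SEARCHER_BYTES / SEARCHER_COUNT
# Each suspect's share implies a total heap size; the two should roughly agree.
heap_from_searchers = SEARCHER_BYTES / (SEARCHER_SHARE / 100)
heap_from_caches = CACHE_BYTES / (CACHE_SHARE / 100)

print(f"average retained size per searcher: {mib(avg_searcher_bytes):.1f} MiB")
print(f"heap implied by suspect 1: {mib(heap_from_searchers):.0f} MiB")
print(f"heap implied by suspect 2: {mib(heap_from_caches):.0f} MiB")
```

The two suspects are deliberately not summed here, since the report does not say whether the caches are already counted inside the searchers' retained size.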

Re: Heap Memory Problem after Upgrading to 7.4.0

Erick Erickson
I would expect at least 1 IndexSearcher per replica. How many total
replicas are hosted in your JVM?

Plus, if you're actively indexing, there may temporarily be 2
IndexSearchers open while the new searcher warms.

And there may be quite a few caches, at least queryResultCache and
filterCache and documentCache, one of each per replica and maybe two
(for queryResultCache and filterCache) if you have a background
searcher autowarming.

At a glance, your autowarm counts are very high, so it may take some
time to autowarm leading to multiple IndexSearchers and caches open
per replica when you happen to hit a commit point. I usually start
with 16-20 as an autowarm count, the benefit decreases rapidly as you
increase the count.

I'm not quite sure why it would be different in 7x vs. 6x. How much
heap do you allocate to the JVM? And do you see similar heap dumps in
6.6?

Best,
Erick
On Mon, Sep 3, 2018 at 10:33 AM Björn Häuser <[hidden email]> wrote:

>
> Hello,
>
> we recently upgraded our solrcloud (5 nodes, 25 collections, 1 shard each, 4 replicas each) from 6.6.0 to 7.3.0 and shortly after to 7.4.0. We are running Zookeeper 4.1.13.

Re: Heap Memory Problem after Upgrading to 7.4.0

Björn Häuser
Hi Erick,

thank you for your answer.

Unfortunately I do not have a heap dump from 6.6.


> On 3. Sep 2018, at 20:48, Erick Erickson <[hidden email]> wrote:
>
> I would expect at least 1 IndexSearcher per replica, how many total
> replicas hosted in your JVM?

27 replicas per JVM.

>
> Plus, if you're actively indexing, there may temporarily be 2
> IndexSearchers open while the new searcher warms.
>
> And there may be quite a few caches, at least queryResultCache and
> filterCache and documentCache, one of each per replica and maybe two
> (for queryResultCache and filterCache) if you have a background
> searcher autowarming.
>
> At a glance, your autowarm counts are very high, so it may take some
> time to autowarm leading to multiple IndexSearchers and caches open
> per replica when you happen to hit a commit point. I usually start
> with 16-20 as an autowarm count, the benefit decreases rapidly as you
> increase the count.

As a countermeasure, I have now reduced the autowarm counts to 10 via API calls. Let me see if the system is more stable now. Tomorrow morning I will create a new heap dump to see if the searchers are still there.
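
For readers who want to make the same change, one possible route is Solr's Config API. The sketch below only builds the request and does not send it. The property names are the ones I recall from the Config API documentation and should be verified against your Solr version; the base URL and the collection name `mycollection` are placeholders.

```python
import json


def autowarm_payload(autowarm_count):
    """Build a Config API 'set-property' body for the two autowarmed caches."""
    # Property names are an assumption recalled from the Config API docs.
    return json.dumps({
        "set-property": {
            "query.filterCache.autowarmCount": autowarm_count,
            "query.queryResultCache.autowarmCount": autowarm_count,
        }
    })


def config_url(base_url, collection):
    """Return the Config API endpoint for one collection."""
    return f"{base_url.rstrip('/')}/{collection}/config"


body = autowarm_payload(10)
url = config_url("http://localhost:8983/solr", "mycollection")  # placeholders
print("POST", url)
print(body)
```

The body would be sent as JSON with an HTTP POST, once per collection.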

Are there any metrics that could tell me that without a heap dump?

>
> I'm not quite sure why it would be different in 7x .vs. 6x. How much
> heap do you allocate to the JVM? And do you see similar heap dumps in
> 6.6?
>
> Best,
> Erick

Thanks Erick!

 Björn
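
Combining Erick's rule of thumb (one searcher per replica, temporarily two while a new one warms) with the 27 replicas per JVM gives a rough upper bound to compare against the 91 instances in the MAT report. This is a back-of-the-envelope sketch; the function name is invented, and it assumes at most one warming searcher per replica.

```python
def expected_searchers(replicas, warming_per_replica=1):
    """Return (steady_state, while_warming) searcher counts for one JVM."""
    steady = replicas
    warming = replicas * (1 + warming_per_replica)
    return steady, warming


REPLICAS_PER_JVM = 27     # from the reply above
OBSERVED_SEARCHERS = 91   # from the MAT report

steady, warming = expected_searchers(REPLICAS_PER_JVM)
excess = OBSERVED_SEARCHERS - warming
print(f"steady state: {steady}, all replicas warming at once: {warming}")
print(f"observed {OBSERVED_SEARCHERS}, i.e. {excess} more than the upper bound")
```

Even if every replica were warming a new searcher at the moment of the dump, the observed count would still be above that bound.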



RE: Heap Memory Problem after Upgrading to 7.4.0

Markus Jelsma-2
In reply to this post by Björn Häuser
Hello,

Getting an OOM plus the fact that you have a lot of IndexSearcher instances rings a familiar bell. One of our collections had the same issue [1] when we attempted to upgrade from 7.2.1 to 7.3.0. I managed to rule out all our custom Solr code but had to keep our Lucene filters in the schema; the problem persisted.

The odd thing, however, is that you appear to have the same problem, but not with 7.3.0? Since you upgraded to 7.4.0 shortly after 7.3.0, can you confirm the problem is not also in 7.3.0?

You should see the instance count for IndexSearcher increase by one for each replica on each commit.
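
One way to watch that instance count without taking a full heap dump is a class histogram, for example from `jmap -histo:live <pid>` (note that the `:live` option forces a full GC first). The sketch below parses such output, assuming the usual four-column layout; the sample text is made up for illustration.

```python
import re

# One row of a class histogram: rank, instance count, bytes, class name.
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def instance_count(histogram_text, class_name):
    """Return the instance count for class_name, or 0 if it is not listed."""
    for line in histogram_text.splitlines():
        match = ROW.match(line)
        if match and match.group(3) == class_name:
            return int(match.group(1))
    return 0


# Made-up sample in the layout printed by a class histogram.
SAMPLE = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:        412345       98765432  [B
   2:            91          10192  org.apache.solr.search.SolrIndexSearcher
"""

print(instance_count(SAMPLE, "org.apache.solr.search.SolrIndexSearcher"))
```

Running this after each commit and comparing the numbers would show whether the count keeps growing.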

Regards,
Markus

[1] http://lucene.472066.n3.nabble.com/RE-7-3-appears-to-leak-td4396232.html 

 
 

Re: Heap Memory Problem after Upgrading to 7.4.0

Erick Erickson
Reducing to 10 won't be definitive, but if the problem gets better
it'll be a clue.

How are you committing? Is it just based on the solrconfig settings or
do you have any clients submitting commit commands?

One fat clue would be if, in your solr logs, you were getting any
warnings about "too many on deck searchers" (going from memory here,
exact wording may differ). That's an indication that your autowarm
times are taking longer than 20 seconds (your soft commit interval),
which would point to excessive autowarming being _part_ of the
problem. This assumes you're indexing steadily.
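
The relationship between warmup time and commit interval can be sketched as follows. This is a simplified model, not Solr's actual scheduling: it assumes a new searcher is opened at every soft commit and ignores any cap such as maxWarmingSearchers. The warmup times are hypothetical.

```python
import math


def overlapping_warmups(warmup_seconds, commit_interval_seconds):
    """Estimate how many searchers warm at once under steady indexing."""
    if warmup_seconds <= 0:
        return 0
    return math.ceil(warmup_seconds / commit_interval_seconds)


SOFT_COMMIT_INTERVAL = 20  # seconds, the interval mentioned above

for warmup in (5, 20, 45):  # hypothetical warmup times in seconds
    n = overlapping_warmups(warmup, SOFT_COMMIT_INTERVAL)
    print(f"warmup {warmup:>2}s -> about {n} searcher(s) warming at once")
```

As long as warming finishes within one interval, at most one searcher is warming at a time; beyond that, they start to pile up.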

Still, though, changing from 6.6 to 7x shouldn't be that much different.

It's possible that you were running close to your heap limit with 6.6
and a relatively small difference in heap usage with 7x threw you over
the tipping point, but that's just hand-waving on my part.

And I'm guessing this is a prod system so experiments aren't tolerable...

What you can measure. Starting with 6.4 there are about a zillion metrics,
try: http://host:port/solr/admin/metrics for the complete list and
pick and choose.

Note that there are ways to cut down on how much is reported, I
suspect you'll be interested first in:
http://localhost:8983/solr/admin/metrics?prefix=SEARCHER

https://lucene.apache.org/solr/guide/7_1/metrics-reporting.html

These tend to be on a per-core (replica) basis so you may have to do
some aggregating.
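
A rough sketch of that aggregation step: build the metrics URL, then sum one metric across all core registries in the parsed JSON. The response shape, the registry names, and the key `SEARCHER.searcher.numDocs` in the sample are assumptions for illustration; check them against what your nodes actually return.

```python
from urllib.parse import urlencode


def metrics_url(base_url, prefix="SEARCHER", group="core"):
    """Build a metrics request limited to one group and key prefix."""
    query = urlencode({"group": group, "prefix": prefix, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/metrics?{query}"


def sum_metric(response, key):
    """Sum one numeric metric over every core registry in a parsed response."""
    total = 0
    for registry in response.get("metrics", {}).values():
        value = registry.get(key)
        if isinstance(value, (int, float)):
            total += value
    return total


# Hypothetical, hand-written response; real registry and key names may differ.
SAMPLE = {
    "metrics": {
        "solr.core.products.shard1.replica_n1": {"SEARCHER.searcher.numDocs": 1200},
        "solr.core.users.shard1.replica_n2": {"SEARCHER.searcher.numDocs": 300},
    }
}

print(metrics_url("http://localhost:8983/solr"))
print(sum_metric(SAMPLE, "SEARCHER.searcher.numDocs"))
```

The same `sum_metric` helper works for any numeric key, so it can be pointed at whichever SEARCHER metric turns out to be interesting.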

Good luck!
Erick

Re: Heap Memory Problem after Upgrading to 7.4.0

Björn Häuser
In reply to this post by Markus Jelsma-2
Hi Markus,

this reads exactly like what we have. Were you able to figure out anything? We are currently thinking about rolling back to 7.2.1.



> On 3. Sep 2018, at 21:54, Markus Jelsma <[hidden email]> wrote:
>
> Hello,
>
> Getting an OOM plus the fact you are having a lot of IndexSearcher instances rings a familiar bell. One of our collections has the same issue [1] when we attempted an upgrade 7.2.1 > 7.3.0. I managed to rule out all our custom Solr code but had to keep our Lucene filters in the schema, the problem persisted.
>
> The odd thing, however, is that you appear to have the same problem, but not with 7.3.0? Since you shortly after 7.3.0 upgraded to 7.4.0, can you confirm the problem is not also in 7.3.0?
>

We had very similar problems with 7.3.0 but never analyzed them and just updated to 7.4.0 because I thought that's the bug we hit: https://issues.apache.org/jira/browse/SOLR-11882


> You should see the instance count for IndexSearcher increase by one for each replica on each commit.


Sorry, where can I find this? ;) I did not find anything.

Thanks
Björn


Re: Heap Memory Problem after Upgrading to 7.4.0

Björn Häuser
In reply to this post by Erick Erickson
Hi,


> On 3. Sep 2018, at 22:18, Erick Erickson <[hidden email]> wrote:
>
> Reducing to 10 won't be definitive, but if the problem gets better
> it'll be a clue.
>
> How are you committing? Is it just based on the solrconfig settings or
> do you have any clients submitting commit commands?

Only through the auto commits, no manual committing from the application.

>
> One fat clue would be if, in your solr logs, you were getting any
> warnings about "too many on deck searchers" (going from memory here,
> exact wording may differ). That's an indication that your autowarm
> times are taking longer than 20 seconds (your soft commit interval),
> which would point to excessive autowarming being _part_ of the
> problem. This assumes you're indexing steadily.

I searched our logs and could not find any evidence for this. I searched for:

- searchers
- auto
- warmup

There was nothing about too many searchers. Which would mean they are actually leaking, and it is not a case of too many warming up, right?
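
A log search like this can be scripted. The sketch below looks for the two phrases I recall from Solr's warnings about overlapping or excess warming searchers; the exact wording is an assumption and the sample lines are made up. It matches case-insensitively, since a case-sensitive search for `searchers` would not match a token such as `onDeckSearchers`.

```python
import re

# Phrases recalled from Solr's warnings about overlapping or excess warming
# searchers; treat the exact wording as an assumption.
PATTERN = re.compile(r"onDeckSearchers|maxWarmingSearchers", re.IGNORECASE)


def warming_warnings(lines):
    """Return the log lines that mention overlapping or excess warming searchers."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


# Made-up log excerpt for illustration only.
SAMPLE_LOG = [
    "2018-09-03 10:00:01 INFO  [c:products] Opening new searcher\n",
    "2018-09-03 10:00:02 WARN  [c:products] PERFORMANCE WARNING: Overlapping onDeckSearchers=2\n",
]

for hit in warming_warnings(SAMPLE_LOG):
    print(hit)
```

Because the function accepts any iterable of lines, an open log file can be passed to it directly.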

>
> Still, though, changing from 6.6 to 7x shouldn't be that much different.
>
> It's possible that you were running close to your heap limit with 6.6
> and a relatively small difference in heap usage with 7x threw you over
> the tipping point, but that's just hand-waving on my part.
>

I really thought about this, but in our 6.6 days we had a lot of headroom in the young generation and also very low GC timings.


> And I'm guessing this is a prod system so experiments aren't tolerable…

What do you have in mind? Increasing memory? That's something we have to do anyway, if it helps.
Our current setup is not very stable anyway, so we have some room for experiments.

>
> What you can measure. Starting with 6.4 there are about a zillion metrics,
> try: http://host:port/solr/admin/metrics for the complete list and
> pick and choose.
>
> Note that there are ways to cut down on how much is reported, I
> suspect you'll be interested first in:
> http://localhost:8983/solr/admin/metrics?prefix=SEARCHER
>
> https://lucene.apache.org/solr/guide/7_1/metrics-reporting.html
>

Funny thing is that we tried to use the Prometheus exporter for these metrics, but whenever we started it, it killed our Solr node immediately.

I will try to look into these metrics, but looking at them yields no valuable results for me. All metrics are “fine”.

Is there anything special you would take a look at?

> These tend to be on a per-core (replica) basis so you may have to do
> some aggregating.
>
> Good luck!


Thank you very much :)
Björn


RE: Heap Memory Problem after Upgrading to 7.4.0

Markus Jelsma-2
In reply to this post by Björn Häuser
Hello Björn,

Take great care, 7.2.1 cannot read an index written by 7.4.0, so you cannot roll back but need to reindex!

Andrey Kudryavtsev made a good suggestion in the thread on how to find the culprit, but it will be a tedious task. I have not yet had the time or courage to venture there.

Hope it helps,
Markus

 
 
-----Original message-----

> From:Björn Häuser <[hidden email]>
> Sent: Monday 3rd September 2018 22:28
> To: [hidden email]
> Subject: Re: Heap Memory Problem after Upgrading to 7.4.0
>
> Hi Markus,
>
> this reads exactly like what we have. Where you able to figure out anything? Currently thinking about rollbacking to 7.2.1.
>
>
>
> > On 3. Sep 2018, at 21:54, Markus Jelsma <[hidden email]> wrote:
> >
> > Hello,
> >
> > Getting an OOM plus the fact you are having a lot of IndexSearcher instances rings a familiar bell. One of our collections has the same issue [1] when we attempted an upgrade 7.2.1 > 7.3.0. I managed to rule out all our custom Solr code but had to keep our Lucene filters in the schema, the problem persisted.
> >
> > The odd thing, however, is that you appear to have the same problem, but not with 7.3.0? Since you shortly after 7.3.0 upgraded to 7.4.0, can you confirm the problem is not also in 7.3.0?
> >
>
> We had very similar problems with 7.3.0 but never analyzed them and just updated to 7.4.0 because I thought thats the bug we hit: https://issues.apache.org/jira/browse/SOLR-11882 <https://issues.apache.org/jira/browse/SOLR-11882>
>
>
> > You should see the instance count for IndexSearcher increase by one for each replica on each commit.
>
>
> Sorry, where can I find this? ;) Sorry, did not find anything.
>
> Thanks
> Björn
>
> >
> > Regards,
> > Markus
> >
> > [1] http://lucene.472066.n3.nabble.com/RE-7-3-appears-to-leak-td4396232.html 
> >
> >
> >
> > -----Original message-----
> >> From:Erick Erickson <[hidden email]>
> >> Sent: Monday 3rd September 2018 20:49
> >> To: solr-user <[hidden email]>
> >> Subject: Re: Heap Memory Problem after Upgrading to 7.4.0
> >>
> >> I would expect at least 1 IndexSearcher per replica, how many total
> >> replicas hosted in your JVM?
> >>
> >> Plus, if you're actively indexing, there may temporarily be 2
> >> IndexSearchers open while the new searcher warms.
> >>
> >> And there may be quite a few caches, at least queryResultCache and
> >> filterCache and documentCache, one of each per replica and maybe two
> >> (for queryResultCache and filterCache) if you have a background
> >> searcher autowarming.
> >>
> >> At a glance, your autowarm counts are very high, so it may take some
> >> time to autowarm leading to multiple IndexSearchers and caches open
> >> per replica when you happen to hit a commit point. I usually start
> >> with 16-20 as an autowarm count, the benefit decreases rapidly as you
> >> increase the count.
> >>
> >> I'm not quite sure why it would be different in 7x .vs. 6x. How much
> >> heap do you allocate to the JVM? And do you see similar heap dumps in
> >> 6.6?
> >>
> >> Best,
> >> Erick
> >> On Mon, Sep 3, 2018 at 10:33 AM Björn Häuser <[hidden email]> wrote:
> >>>
> >>> Hello,
> >>>
> >>> we recently upgraded our solrcloud (5 nodes, 25 collections, 1 shard each, 4 replicas each) from 6.6.0 to 7.3.0 and shortly after to 7.4.0. We are running Zookeeper 4.1.13.
> >>>
> >>> Since the upgrade to 7.3.0 and also 7.4.0 we encountering heap space exhaustion. After obtaining a heap dump it looks like that we have a lot of IndexSearchers open for our largest collection.
> >>>
> >>> The dump contains around ~60 IndexSearchers, and each containing around ~40mb heap. Another 500MB of heap is the fieldcache, which is expected in my opinion.
> >>>
> >>> The current config can be found here: https://gist.github.com/bjoernhaeuser/327a65291ac9793e744b87f0a561e844 <https://gist.github.com/bjoernhaeuser/327a65291ac9793e744b87f0a561e844>
> >>>
> >>> Analyzing the heap dump eclipse MAT says this:
> >>>
> >>> Problem Suspect 1
> >>>
> >>> 91 instances of "org.apache.solr.search.SolrIndexSearcher", loaded by "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy 1.981.148.336 (38,26%) bytes.
> >>>
> >>> Biggest instances:
> >>>
> >>>        • org.apache.solr.search.SolrIndexSearcher @ 0x6ffd47ea8 - 70.087.272 (1,35%) bytes.
> >>>        • org.apache.solr.search.SolrIndexSearcher @ 0x79ea9c040 - 65.678.264 (1,27%) bytes.
> >>>        • org.apache.solr.search.SolrIndexSearcher @ 0x6855ad680 - 63.050.600 (1,22%) bytes.
> >>>
> >>>
> >>> Problem Suspect 2
> >>>
> >>> 223 instances of "org.apache.solr.util.ConcurrentLRUCache", loaded by "org.eclipse.jetty.webapp.WebAppClassLoader @ 0x6807d1048" occupy 1.373.110.208 (26,52%) bytes.
> >>>
> >>>
> >>> Any help is appreciated. Thank you very much!
> >>> Björn
> >>
>
>

Re: Heap Memory Problem after Upgrading to 7.4.0

Tomás Fernández Löbbe
I think this is pretty bad. I created
https://issues.apache.org/jira/browse/SOLR-12743. Feel free to add any more
details you have there.


RE: Heap Memory Problem after Upgrading to 7.4.0

Markus Jelsma-2
Thanks Tomás!

Björn, can you reproduce the problem in a local and controlled environment?

Markus
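
For a local reproduction, one way to check whether searchers pile up is to take a class histogram of the Solr JVM (for example with `jmap -histo:live <pid>`) before and after a few commits and compare the `SolrIndexSearcher` instance counts. The following is only a minimal sketch under stated assumptions: the helper names `count_instances` and `searcher_growth` are hypothetical, and the two sample histograms are made-up data used to show the expected input shape, not output from this cluster.

```python
# Sketch: track SolrIndexSearcher instance counts across `jmap -histo` snapshots.
# Helper names and sample data are illustrative assumptions, not Solr tooling.

SEARCHER_CLASS = "org.apache.solr.search.SolrIndexSearcher"


def count_instances(histo_text, class_name):
    """Return (instances, bytes) for class_name in histogram text, or None."""
    for line in histo_text.splitlines():
        fields = line.split()
        # Data rows look like: "<rank>: <#instances> <#bytes> <class name> [(module)]"
        if len(fields) >= 4 and fields[0].endswith(":") and fields[3] == class_name:
            return int(fields[1]), int(fields[2])
    return None


def searcher_growth(snapshots, class_name=SEARCHER_CLASS):
    """Return the instance count per snapshot (0 when the class is absent)."""
    counts = []
    for text in snapshots:
        row = count_instances(text, class_name)
        counts.append(row[0] if row else 0)
    return counts


# Made-up histograms, only to demonstrate the parsing.
SAMPLE_BEFORE = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:        120000        9600000  [C
   2:            20           3520  org.apache.solr.search.SolrIndexSearcher
Total        120020        9603520
"""

SAMPLE_AFTER = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:        130000       10400000  [C
   2:            24           4224  org.apache.solr.search.SolrIndexSearcher
Total        130024       10404224
"""

if __name__ == "__main__":
    print(searcher_growth([SAMPLE_BEFORE, SAMPLE_AFTER]))  # prints [20, 24]
```

A count that keeps rising across commits and never drops back after the old searchers should have closed would point to the kind of leak discussed in this thread. Note that the byte column in a histogram is shallow size, so it will not match the retained sizes reported by Eclipse MAT.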

 
 

Re: Heap Memory Problem after Upgrading to 7.4.0

Erick Erickson
All:

Let's move the rest of the conversation over to the JIRA Tomás raised,
so we have a record of what's been attempted to track this.

It would be great if Markus and Björn could add some environment info
to the JIRA, in particular the Java version and the operating system
you're each using.

Thanks,
Erick
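
For anyone collecting the environment details requested above, a small script can gather them in one place for pasting into the issue. This is only a rough sketch: the function names `java_version_banner` and `environment_report` are hypothetical, and it assumes the `java` found on the PATH is the same one Solr runs with, which may not hold on every host.

```python
# Sketch: collect OS and Java version details for a bug report.
# Function names are illustrative; verify that `java` on PATH is Solr's JVM.
import platform
import subprocess


def java_version_banner(java_cmd="java"):
    """Return the first line printed by `java -version`, or None if unavailable."""
    try:
        result = subprocess.run(
            [java_cmd, "-version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # `java -version` writes its banner to stderr rather than stdout.
    lines = (result.stderr or result.stdout).splitlines()
    return lines[0] if lines else None


def environment_report():
    """Return a dict describing the operating system and the Java runtime."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "arch": platform.machine(),
        "java": java_version_banner(),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

The JVM heap settings (for example the `-Xmx` value Solr was started with) are not covered by this sketch and would need to be added by hand.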