Solr-8.1.0 uses much more memory


Solr-8.1.0 uses much more memory

Joe Doupnik
     Comparing memory consumption (real, not virtual) of quiescent Solr
v8.0 and prior with Solr v8.1.0 reveals the older versions use about
1.6GB on my systems but v8.1.0 uses 4.5 to 5+GB. Systems used are SUSE
Linux, with Oracle JDK v1.8 and openjdk v10. This is a major memory
consumption issue. I have seen no mention of it in the docs or forums.
     Thanks,
     Joe D.

Re: Solr-8.1.0 uses much more memory

Shawn Heisey-2
On 5/25/2019 9:40 AM, Joe Doupnik wrote:
>      Comparing memory consumption (real, not virtual) of quiescent Solr
> v8.0 and prior with Solr v8.1.0 reveals the older versions use about
> 1.6GB on my systems but v8.1.0 uses 4.5 to 5+GB. Systems used are SUSE
> Linux, with Oracle JDK v1.8 and openjdk v10. This is a major memory
> consumption issue. I have seen no mention of it in the docs or forums.

If Solr is using 4 to 5 GB of memory on your system, it is only doing
that because you told it that it was allowed to.

If you run a Java program with a minimum heap that's smaller than the
max heap, which Solr does not do by default, then what you will find is
that Java *might* stay lower than the maximum for a while.  But
eventually it WILL allocate the entire maximum heap from the OS, plus
some extra for Java itself to work with.  Solr 8.0 and Solr 8.1 are not
different from each other in this regard.

Thanks,
Shawn
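
The distinction drawn above between the minimum heap, the maximum heap, and what the JVM has actually taken from the OS can be read from inside a JVM. What follows is a minimal sketch, not anything shipped with Solr: the class name HeapSettings and its helper methods are made up for illustration, and only the standard java.lang.management API is assumed. It reports on the JVM it runs in, so it shows the idea rather than inspecting a running Solr process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSettings {
    /** Formats a byte count as whole mebibytes; MemoryUsage uses -1 for "undefined". */
    public static String mib(long bytes) {
        return bytes < 0 ? "undefined" : (bytes / (1024 * 1024)) + " MiB";
    }

    /** One-line summary: initial heap, heap currently committed by the OS, and the ceiling. */
    public static String summary() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return "init=" + mib(heap.getInit())
             + " committed=" + mib(heap.getCommitted())
             + " max=" + mib(heap.getMax());
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

When the minimum and maximum heap are set to the same value, init and max should report roughly the same figure, and the committed figure is the part that shows up as resident memory once the pages have been touched.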

Re: Solr-8.1.0 uses much more memory

Joe Doupnik
On 26/05/2019 19:08, Shawn Heisey wrote:

> If Solr is using 4 to 5 GB of memory on your system, it is only doing
> that because you told it that it was allowed to.
>
> If you run a Java program with a minimum heap that's smaller than the
> max heap, which Solr does not do by default, then what you will find
> is that Java *might* stay lower than the maximum for a while.  But
> eventually it WILL allocate the entire maximum heap from the OS, plus
> some extra for Java itself to work with.  Solr 8.0 and Solr 8.1 are
> not different from each other in this regard.
>
> Thanks,
> Shawn
--------
     Not to be argumentative, but prior to Solr v8.1 quiescent resident
memory remained at about the 1.6GB level, and during active indexing it
could exceed 3.5GB. With the same configuration settings Solr v8.1
changes that to use _a lot_ more memory. Thus something significant has
changed with Solr v8.1 when compared to its predecessors. The question
is what, and what can we do about it.
     I am not about to enter a guessing game with Solr and Java and its
heap usage. That is far too complex to hope to win.
     Thus, something changed, for the worse here in the field, and I do
not know what.
     Thanks,
     Joe D.

Re: Solr-8.1.0 uses much more memory

Joe Doupnik
On 26/05/2019 19:15, Joe Doupnik wrote:

>     Not to be argumentative, but prior to Solr v8.1 quiescent resident
> memory remained at about the 1.6GB level, and during active indexing
> it could exceed 3.5GB. With the same configuration settings Solr v8.1
> changes that to use _a lot_ more memory. Thus something significant
> has changed with Solr v8.1 when compared to its predecessors. The
> question is what, and what can we do about it.
>     I am not about to enter a guessing game with Solr and Java and its
> heap usage. That is far too complex to hope to win.
>     Thus, something changed, for the worse here in the field, and I do
> not know what.
>     Thanks,
>     Joe D.
---------------
     If I were forced to guess about this situation it would be to flag
an item mentioned vaguely in passing: the garbage collector. How to
return it to status quo ante is not known here. Presumably such a step
would be covered in the yet to appear documentation for Solr v8.1.
     To add a little more to the story: memory remained at the 1.6GB
level except when doing heavy indexing. To "adjust" Solr so that it
always consumes too much, as at present, is not acceptable, nor is it
acceptable to risk trouble by setting an upper limit down to, say, 1.6GB
and thereby causing indexing to fail.
     We see the dilemma. Expert assistance is needed to resolve this.
     Thanks,
     Joe D.

Re: Solr-8.1.0 uses much more memory

Jörn Franke
In reply to this post by Joe Doupnik
Different garbage collector configuration? Occupied memory does not necessarily mean that Solr uses more memory; it could also mean that the JVM has just kept it reserved for future memory needs.


Re: Solr-8.1.0 uses much more memory

Jörn Franke
In reply to this post by Joe Doupnik
I think this is also a very risky memory strategy. What happens if you index and query at the same time, etc.? It may be more worthwhile to provide as much memory as concurrent operations need. This includes JVM memory and also the disk caches.

> On 26.05.2019 at 20:38, Joe Doupnik <[hidden email]> wrote:
>     If I were forced to guess about this situation it would be to flag an item mentioned vaguely in passing: the garbage collector. How to return it to status quo ante is not known here. Presumably such a step would be covered in the yet to appear documentation for Solr v8.1.
>     To add a little more to the story: memory remained at the 1.6GB level except when doing heavy indexing. To "adjust" Solr so that it always consumes too much, as at present, is not acceptable, nor is it acceptable to risk trouble by setting an upper limit down to, say, 1.6GB and thereby causing indexing to fail.
>     We see the dilemma. Expert assistance is needed to resolve this.
>     Thanks,
>     Joe D.

Re: Solr-8.1.0 uses much more memory

Joe Doupnik
In reply to this post by Jörn Franke
On 26/05/2019 19:38, Jörn Franke wrote:
> Different garbage collector configuration? It does not mean that Solr uses more memory if it is occupied - it could also mean that the JVM just kept it reserved for future memory needs.
-------
     The garbage collector was on my mind as well (in a msg sent just
before yours). These numbers are easy to verify, just by using "top".
They say allocated, meaning Java owns it, no matter what Java does with
it. Java does not own the machine; there are other useful activities to
tend to as well.
     Let's find the problem and cure it.
     Thanks,
     Joe D.

Re: Solr-8.1.0 uses much more memory

Joe Doupnik
In reply to this post by Jörn Franke
     I do queries while indexing, have done so for a long time, without
difficulty or memory usage spikes from dual use. The system has been
designed to support that.
     Again, one may look at the numbers using "top" or similar. Try Solr
v8.0 and 8.1 to see the difference which I experience here. For
reference, the only memory adjustables set in my configuration are in
the Solr startup script solr.in.sh: "-Xss1024k" added to the SOLR_OPTS
list, and SOLR_HEAP="4024m".
     Thanks,
     Joe D.
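
Since the question is which memory settings and which collector are really in effect, one way to check is to ask the JVM for the options it was started with. The sketch below is an illustration only: JvmFlags and its two helper methods are hypothetical names, and the code relies solely on the standard RuntimeMXBean and GarbageCollectorMXBean interfaces. It inspects the JVM it runs in; for an already running Solr, JDK tools such as jcmd or jstat, pointed at the Solr process ID, are the usual route.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class JvmFlags {
    /** Keeps only memory-related options: -Xms/-Xmx/-Xmn, -Xss and any -XX: flag. */
    public static List<String> memoryFlags(List<String> jvmArgs) {
        List<String> kept = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-Xm") || arg.startsWith("-Xss") || arg.startsWith("-XX:")) {
                kept.add(arg);
            }
        }
        return kept;
    }

    /** Names of the garbage collectors active in this JVM, as the JVM reports them. */
    public static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("memory flags: " + memoryFlags(jvmArgs));
        System.out.println("collectors:   " + collectorNames());
    }
}
```

Comparing this output between a v8.0 and a v8.1 installation started from the same solr.in.sh would show whether the heap options are identical and whether only the collector differs.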

On 26/05/2019 19:43, Jörn Franke wrote:

> I think this is also a very risky memory strategy. What happens if you Index and query at the same time etc. maybe it is more worth to provide as much memory as for concurrent operations are needed. This includes JVM memory but also the disk caches.


Re: Solr-8.1.0 uses much more memory

Shawn Heisey-2
On 5/26/2019 12:52 PM, Joe Doupnik wrote:
>      I do queries while indexing, have done so for a long time, without
> difficulty nor memory usage spikes from dual use. The system has been
> designed to support that.
>      Again, one may look at the numbers using "top" or similar. Try Solr
> v8.0 and 8.1 to see the difference which I experience here. For
> reference, the only memory adjustables set in my configuration is in the
> Solr startup script solr.in.sh saying add "-Xss1024k" in the SOLR_OPTS
> list and setting SOLR_HEAP="4024m".

There is one significant difference between 8.0 and 8.1 in the realm of
memory management -- we have switched from the CMS garbage collector to
the G1 collector.  So the way that Java manages the heap has changed.
This was done because the CMS collector is slated for removal from Java.

https://issues.apache.org/jira/browse/SOLR-13394

Java is unlike other programs in one respect -- once it allocates heap
from the OS, it never gives it back.  This behavior has given Java an
undeserved reputation as a memory hog ... but in fact Java's overall
memory usage can be very easily limited ... an option that many other
programs do NOT have.

In your configuration, you set the max heap to a little less than 4GB.
You have to expect that it *WILL* use that memory.  By using the
SOLR_HEAP variable, you have instructed Solr's startup script to use the
same setting for the minimum heap as well as the maximum heap.  This is
the design intent.

If you want to know how much heap is being used, you can't ask the
operating system, which means tools like top.  You have to ask Java.
And you will have to look at a long-term graph, finding the low points.
An instantaneous look at Java's heap usage could show you that the whole
heap is allocated ... but a significant part of that allocation could be
garbage, which becomes available once the garbage is collected.

Thanks,
Shawn
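
The advice above to ask Java rather than the operating system, and to look for the low points over time, can be sketched as follows. This is an assumption-laden illustration, not a Solr feature: HeapLowPoint and lowestUsedBytes are hypothetical names, the sampling interval is arbitrary, and the code measures the JVM it runs in using the standard MemoryMXBean.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapLowPoint {
    /**
     * Samples used heap `samples` times, pausing `pauseMillis` between samples,
     * and returns the lowest value seen. Low points tend to follow a collection,
     * so they approximate live data better than a single reading does.
     */
    public static long lowestUsedBytes(int samples, long pauseMillis) {
        if (samples < 1) {
            throw new IllegalArgumentException("samples must be at least 1");
        }
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long lowest = Long.MAX_VALUE;
        for (int i = 0; i < samples; i++) {
            lowest = Math.min(lowest, memory.getHeapMemoryUsage().getUsed());
            if (i < samples - 1) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt and stop sampling early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return lowest;
    }

    public static void main(String[] args) {
        long committed = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getCommitted();
        long low = lowestUsedBytes(10, 200L);
        System.out.println("committed heap:        " + committed / (1024 * 1024) + " MiB");
        System.out.println("lowest used heap seen: " + low / (1024 * 1024) + " MiB");
    }
}
```

The gap between the committed figure and the lowest used figure is memory the JVM holds but is not filling with live objects, which is the part that tools like top cannot tell apart.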

Re: Solr-8.1.0 uses much more memory

Joel Bernstein
I'm not sure this issue applies in this situation but it's worth taking a
look at:

https://issues.apache.org/jira/browse/SOLR-12833?focusedCommentId=16807868&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16807868

Although the memory issue in the ticket involves different versions than I
think are being discussed, it's good to understand that this issue exists
and that it's resolved going forward.

Also, because of the way this issue is attached to the original ticket
that caused the bug, rather than filed as a new bug report, it's very hard
to know that this problem actually existed.

Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, May 26, 2019 at 3:30 PM Shawn Heisey <[hidden email]> wrote:

> There is one significant difference between 8.0 and 8.1 in the realm of
> memory management -- we have switched from the CMS garbage collector to
> the G1 collector.  So the way that Java manages the heap has changed.
> This was done because the CMS collector is slated for removal from Java.
>
> https://issues.apache.org/jira/browse/SOLR-13394
>
> Java is unlike other programs in one respect -- once it allocates heap
> from the OS, it never gives it back.  This behavior has given Java an
> undeserved reputation as a memory hog ... but in fact Java's overall
> memory usage can be very easily limited ... an option that many other
> programs do NOT have.
>
> In your configuration, you set the max heap to a little less than 4GB.
> You have to expect that it *WILL* use that memory.  By using the
> SOLR_HEAP variable, you have instructed Solr's startup script to use the
> same setting for the minimum heap as well as the maximum heap.  This is
> the design intent.
>
> If you want to know how much heap is being used, you can't ask the
> operating system, which means tools like top.  You have to ask Java.
> And you will have to look at a long-term graph, finding the low points.
> An instantaneous look at Java's heap usage could show you that the whole
> heap is allocated ... but a significant part of that allocation could be
> garbage, which becomes available once the garbage is collected.
>
> Thanks,
> Shawn
>

Re: Solr-8.1.0 uses much more memory

Joe Doupnik
In reply to this post by Shawn Heisey-2
     Generalizations tend to fail when confronted with conflicting
evidence. The simple evidence is asking how much real memory the
Solr-owned process has been allocated (top, or ps aux or similar) and
that yields two very different values (the ~1.6GB of Solr v8.0 and
4.5+GB of Solr v8.1). I have no knowledge of how Java chooses to name
its usage (heap or otherwise). Prior to v8.1 Solr memory consumption
varied with activity, thus memory management was occurring: memory was
borrowed from and returned to the system. What might be happening in
Solr v8.1 is that the new memory management code is failing to do a
proper job, for reasons which are not visible to us in the field, and
that failure is important to us.
     In regard to the referenced lock discussion, it would be a good
idea to not let the tail wag the dog: tend to the common cases and live
with a few corner case difficulties because perfection is not possible.
     Thanks,
     Joe D.



Re: Solr-8.1.0 uses much more memory

Joe Doupnik
     While on the topic of resource consumption, locks, etc., there is
one other aspect to which Solr has been vulnerable: failing to fend off
too many requests at one time. The standard approach is, of course,
named back pressure, such as not replying to a query until resources
permit and thus keeping competition outside of the application. That
limits resource consumption, including locks, memory and sundry, while
permitting normal work within to progress smoothly. Let the crowds
coming to a hit show queue in the rain outside the theatre until empty
seats become available.
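
The theatre picture above can be sketched with a counting semaphore: a fixed number of seats, and callers wait outside until one is free. This is a generic illustration of back pressure under stated assumptions, not a description of how Solr handles requests; BlockingGate, run and freeSeats are hypothetical names, and only java.util.concurrent.Semaphore from the standard library is used.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class BlockingGate {
    private final Semaphore seats;

    public BlockingGate(int maxConcurrent) {
        // Fair semaphore: waiting callers are admitted in arrival order, like a queue.
        this.seats = new Semaphore(maxConcurrent, true);
    }

    /** Waits outside until a seat is free, runs the work, then gives the seat back. */
    public <T> T run(Supplier<T> work) {
        seats.acquireUninterruptibly();
        try {
            return work.get();
        } finally {
            seats.release();
        }
    }

    /** Number of seats not currently taken. */
    public int freeSeats() {
        return seats.availablePermits();
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingGate gate = new BlockingGate(2);
        Thread[] callers = new Thread[5];
        for (int i = 0; i < callers.length; i++) {
            final int id = i;
            callers[i] = new Thread(() -> gate.run(() -> {
                System.out.println("request " + id + " admitted");
                return id;
            }));
            callers[i].start();
        }
        for (Thread t : callers) {
            t.join();
        }
    }
}
```

At most two requests are ever inside run at once, so the memory and locks they need are bounded by the seat count rather than by the size of the crowd.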

On 27/05/2019 08:52, Joe Doupnik wrote:

> Generalizations tend to fail when confronted with conflicting
> evidence. The simple evidence is asking how much real memory the
> Solr-owned process has been allocated (top, or ps aux or similar) and
> that yields two very different values (the ~1.6GB of Solr v8.0 and
> 4.5+GB of Solr v8.1). I have no knowledge of how Java chooses to name
> its usage (heap or otherwise). Prior to v8.1 Solr memory consumption
> varied with activity, thus memory management was occurring: memory was
> borrowed from and returned to the system. What might be happening in
> Solr v8.1 is that the new memory management code is failing to do a
> proper job, for reasons which are not visible to us in the field, and
> that failure is important to us.
>     In regard to the referenced lock discussion, it would be a good
> idea to not let the tail wag the dog: tend to the common cases and live
> with a few corner case difficulties because perfection is not possible.
>     Thanks,
>     Joe D.


Re: Solr-8.1.0 uses much more memory

Bernd Fehling
In reply to this post by Joe Doupnik
I think it is not fair to blame Solr for not also having a load balancer.
It is up to you and your needs to set up the required infrastructure,
including load balancing. There are many products available on the market.
If your current system can't handle all requests, then install more replicas.

Regards
Bernd

On 27.05.19 at 10:33, Joe Doupnik wrote:

>      While on the topic of resource consumption, locks, etc., there is one other aspect to which Solr has been vulnerable: failing to
> fend off too many requests at one time. The standard approach is, of course, named back pressure, such as not replying to a query until
> resources permit and thus keeping competition outside of the application. That limits resource consumption, including locks, memory and sundry,
> while permitting normal work within to progress smoothly. Let the crowds coming to a hit show queue in the rain outside the theatre until empty
> seats become available.

Re: Solr-8.1.0 uses much more memory

Joe Doupnik
     You are certainly correct about using external load balancers when
appropriate. However, a basic problem with servers, that of accepting
more incoming items than can be handled gracefully, is as we know an
age-old one, and it is solved by back pressure methods (particularly
hard limits). My experience with Solr suggests that parts (say Tika) are
being too nice to incoming material, letting too many items enter the
application and consume resources, which then become awkward to handle
(see the locks item discussion cited earlier). Entry ought to be blocked
until the processing structure declares that resources are available to
accept new entries (a full but not overfull pipeline). Those internal
issues, locks, memory and similar, are resolvable when limits are
imposed. Also, with limits in place the load balancers you mention
stand a chance of sensing when a particular server is currently not
accepting new requests. Establishing limits does take some creative
thinking about how the system as a whole is constructed.
     I brought up the overload case because it pertains to this main
memory management thread.
     Thanks,
     Joe D.
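
A hard limit of the kind described above differs from queueing in that excess work is refused at once, which gives an upstream load balancer something to sense. The sketch below illustrates that idea only; RejectingGate and tryRun are hypothetical names, nothing here reflects Solr or Tika internals, and how a refusal is reported to the client (for instance as an HTTP 503) is left as an assumption outside the code.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class RejectingGate {
    private final Semaphore slots;

    public RejectingGate(int maxInFlight) {
        this.slots = new Semaphore(maxInFlight);
    }

    /**
     * Runs the work if a slot is free; otherwise returns empty immediately,
     * without queueing. The work must return a non-null result.
     */
    public <T> Optional<T> tryRun(Supplier<T> work) {
        if (!slots.tryAcquire()) {
            return Optional.empty(); // full pipeline: refuse rather than accumulate
        }
        try {
            return Optional.of(work.get());
        } finally {
            slots.release();
        }
    }

    public static void main(String[] args) {
        RejectingGate gate = new RejectingGate(1);
        Optional<String> outcome = gate.tryRun(() -> "indexed");
        System.out.println(outcome.map(r -> "accepted: " + r).orElse("rejected, try again later"));
    }
}
```

Because refused work never enters the application, it never holds memory or locks, and the number of in-flight items stays at or below the configured limit.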

On 27/05/2019 10:21, Bernd Fehling wrote:

> I think it is not fair to blame Solr for not also having a load balancer.
> It is up to you and your needs to set up the required infrastructure,
> including load balancing. There are many products available on the market.
> If your current system can't handle all requests, then install more
> replicas.
>
> Regards
> Bernd
>
> Am 27.05.19 um 10:33 schrieb Joe Doupnik:
>>      While on the topic of resource consumption and locks etc, there
>> is one other aspect to which Solr has been vulnerable. It is failing
>> to fend off too many requests at one time. The standard approach is,
>> of course, named back pressure, such as not replying to a query until
>> resources permit and thus keeping competition outside of the
>> application. That limits resource consumption, including locks,
>> memory and sundry, while permitting normal work within to progress
>> smoothly. Let the crowds coming to a hit show queue in the rain
>> outside the theatre until empty seats become available.
>>
>> On 27/05/2019 08:52, Joe Doupnik wrote:
>>> Generalizations tend to fail when confronted with conflicting
>>> evidence. The simple  evidence is asking how much real memory the
>>> Solr owned process has been allocated (top, or ps aux or similar)
>>> and that yields two very different values (the ~1.6GB of Solr v8.0
>>> and 4.5+GB of Solr v8.1). I have no knowledge of how Java chooses to
>>> name its usage (heap or otherwise). Prior to v8.1 Solr memory
>>> consumption varied with activity, thus memory management was
>>> occurring, memory was borrowed from and returned to the system. What
>>> might be happening in Solr v8.1 is the new memory management code is
>>> failing to do a proper job, for reasons which are not visible to us
>>> in the field, and that failure is important to us.
>>>     In regard to the referenced lock discussion, it would be a good
>>> idea to not let the tail wag the dog, tend the common cases and live
>>> with a few corner case difficulties because perfection is not possible.
>>>     Thanks,
>>>     Joe D.
>>>
>>> On 26/05/2019 20:30, Shawn Heisey wrote:
>>>> On 5/26/2019 12:52 PM, Joe Doupnik wrote:
>>>>>      I do queries while indexing, have done so for a long time,
>>>>> without difficulty nor memory usage spikes from dual use. The
>>>>> system has been designed to support that.
>>>>>      Again, one may look at the numbers using "top" or similar.
>>>>> Try Solr v8.0 and 8.1 to see the difference which I experience
>>>>> here. For reference, the only memory adjustables set in my
>>>>> configuration is in the Solr startup script solr.in.sh saying add
>>>>> "-Xss1024k" in the SOLR_OPTS list and setting SOLR_HEAP="4024m".
>>>>
>>>> There is one significant difference between 8.0 and 8.1 in the
>>>> realm of memory management -- we have switched from the CMS garbage
>>>> collector to the G1 collector.  So the way that Java manages the
>>>> heap has changed. This was done because the CMS collector is slated
>>>> for removal from Java.
>>>>
>>>> https://issues.apache.org/jira/browse/SOLR-13394
>>>>
>>>> Java is unlike other programs in one respect -- once it allocates
>>>> heap from the OS, it never gives it back.  This behavior has given
>>>> Java an undeserved reputation as a memory hog ... but in fact
>>>> Java's overall memory usage can be very easily limited ... an
>>>> option that many other programs do NOT have.
>>>>
>>>> In your configuration, you set the max heap to a little less than
>>>> 4GB. You have to expect that it *WILL* use that memory.  By using
>>>> the SOLR_HEAP variable, you have instructed Solr's startup script
>>>> to use the same setting for the minimum heap as well as the maximum
>>>> heap. This is the design intent.
>>>>
>>>> If you want to know how much heap is being used, you can't ask the
>>>> operating system, which means tools like top.  You have to ask
>>>> Java. And you will have to look at a long-term graph, finding the
>>>> low points. An instantaneous look at Java's heap usage could show
>>>> you that the whole heap is allocated ... but a significant part of
>>>> that allocation could be garbage, which becomes available once the
>>>> garbage is collected.
>>>>
>>>> Thanks,
>>>> Shawn
>>>
>>


Re: Solr-8.1.0 uses much more memory

Walter Underwood
Solr really should use a limited pool for handling external requests. We’ve driven it into OOM a few times with too much traffic, just creating a useless number of threads.

But that requires separate pools for external requests and cluster-internal requests, which would probably require separate ports for external and internal.

We’ve considered running a local copy of nginx on each server, exposing that different port as the external port, and using nginx to limit traffic. But Solr really should not create thousands of internal threads then fall over. That is just dumb.
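
As a rough illustration of the "limited pool" idea, the Java sketch below builds a fixed-size executor with a bounded queue that rejects overflow instead of creating more threads. It is not how Solr is implemented: the class name RequestPools and the sizes passed in are hypothetical. Only the java.util.concurrent classes are standard.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical factory for request pools that cannot grow without bound. */
final class RequestPools {
    /**
     * Fixed number of threads plus a bounded waiting queue. Once both are
     * full, execute() throws RejectedExecutionException instead of
     * spawning another thread.
     */
    static ThreadPoolExecutor bounded(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,                        // core == max: pool never grows
                0L, TimeUnit.MILLISECONDS,               // no idle timeout needed for core threads
                new ArrayBlockingQueue<>(queueCapacity), // bounded backlog
                new ThreadPoolExecutor.AbortPolicy());   // reject when full
    }
}
```

Calling bounded() twice, once for external traffic and once for cluster-internal traffic, would give the two separate pools described above, so a flood of external requests could not starve internal ones. The pool sizes would have to come from load testing.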

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)


Re: Solr-8.1.0 uses much more memory

Joe Doupnik
     A few more numbers to contemplate. An experiment here, adding 80
PDF and PPTX files into an empty index.

Solr v8.0, regular settings: 1.7GB quiescent memory consumption, 1.9GB
while indexing, 2.92 minutes to do the job.
Solr v8.0, using GC_TUNE from v8.1 solr.in.sh: 1.1GB quiescent, 1.3GB
while indexing, 2.97 minutes.
Solr v8.1, regular settings: 4.3GB quiescent, 4.4GB while indexing, 1.67
minutes.
Solr v8.1, using GC_TUNE from v8.1 solr.in.sh: 1.0GB quiescent, 1.3GB
while indexing, 1.53 minutes.

     It is clear that the GC_TUNE settings from v8.1 are beneficial to
v8.0, saving about 600MB of memory. That's not small change.
     Also clear is that Solr v8.1 is slightly faster than v8.0 when both
use those TUNE values. A hidden benefit.
     Without GC_TUNE settings Solr v8.1 shows its appetite for much
memory, several GB's more than v8.0.

     Because those TUNE settings can make an improvement to Solr v8.0 it
would be beneficial to have the documentation discuss that usage.
Meanwhile, the memory consumption problem remains as discussed.

     On the overfeeding part of things. The classical approach is to
pipeline the work and, between each stage, have a go/stop sign to
throttle traffic (a road crossing lollipop lady, if you like). Such
signs could be set when a regional thread consumption limit is reached,
or a similar resource limit is encountered. This permits one stage to
stop listening while the work continues within it and many other stages,
and then the sign changes to go and the regional flow resumes. We see
this in common road/people traffic situations every day. It's nicely
asynchronous and does not need a complicated (nor any) master
controller. The key is to have limits based on sound engineering
criteria, and yes, that might mean having a few sets of them for
different operating situations, from which the customer chooses
appropriately.
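
     A go/stop sign between two stages can be sketched in Java with a
bounded queue. This is a minimal sketch under stated assumptions, not
Solr code: StageGate is a hypothetical name, and the capacity and
timeout are placeholders. The upstream stage is held back when the
downstream stage has no free slot, and no master controller is involved.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical hand-off point between two pipeline stages. */
final class StageGate<T> {
    private final BlockingQueue<T> slots;

    StageGate(int capacity) {
        this.slots = new ArrayBlockingQueue<>(capacity);
    }

    /**
     * Upstream side: waits up to timeoutMillis for a free slot.
     * A false return is the "stop" sign; true means the item was accepted.
     */
    boolean offer(T item, long timeoutMillis) throws InterruptedException {
        return slots.offer(item, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Downstream side: blocks until an item arrives; taking it frees a slot. */
    T take() throws InterruptedException {
        return slots.take();
    }
}
```

     Each pair of adjacent stages would get its own gate, so a slow
stage throttles only its immediate neighbour and the effect propagates
backwards one stage at a time.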
     Thanks,
     Joe D.


Re: Solr-8.1.0 uses much more memory

Shawn Heisey-2
On 5/27/2019 9:49 AM, Joe Doupnik wrote:

>      A few more numbers to contemplate. An experiment here, adding 80
> PDF and PPTX files into an empty index.
>
> Solr v8.0 regular settings, 1.7GB quiescent memory consumption, 1.9GB
> while indexing, 2.92 minutes to do the job.
> Solr v8.0, using GC_TUNE from v8.1 solr.in.sh, 1.1GB quiescent, 1.3GB
> while indexing,  2.97 minutes.
> Solr v8.1, regular settings, 4.3GB quiescent, 4.4GB while indexing, 1.67
> minutes
> Solr v8.1, using GC_TUNE from v8.1 solr.in.sh, 1.0GB quiescent, 1.3GB
> while indexing, 1.53 minutes
>
>      It is clear that the GC_TUNE settings from v8.1 are beneficial to
> v8.0, saving about 600MB of memory. That's not small change.

GC tuning will not change the amount of memory the program needs.  It
*can't* change it.  All it can do is affect how the garbage collector
works.  Different collectors can result in differences in how much
memory an outside observer will see allocated, because one may be more
aggressive about early collection than the other, but the amount of heap
actually required by the program will not change.
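
To illustrate the difference between what the program is using and what an outside observer sees, the Java sketch below asks the JVM for its own view of the heap through the standard java.lang.management API. The class name HeapReport is hypothetical. "used" covers live objects plus garbage not yet collected, "committed" is what the JVM currently holds from the OS for the heap, and "max" is the ceiling set by the maximum heap setting.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper that reports the JVM's own view of its heap. */
final class HeapReport {
    /** Snapshot of heap usage for the JVM this code runs in. */
    static MemoryUsage current() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    /**
     * One-line summary in MiB. getMax() returns -1 when no maximum is
     * defined, which integer division turns into 0 here.
     */
    static String describe(MemoryUsage heap) {
        long mib = 1024L * 1024L;
        return String.format("used=%dMiB committed=%dMiB max=%dMiB",
                heap.getUsed() / mib, heap.getCommitted() / mib, heap.getMax() / mib);
    }
}
```

This only reports on the JVM that runs it, so it would have to execute inside the Solr process or be replaced by a remote JMX query. The figure from top is typically closer to the committed value plus non-heap overhead than to the used value, though the exact relationship depends on the platform and JVM settings.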

The commented out GC_TUNE settings in the 8.1 "bin/solr.in.sh" file are
the old CMS settings that earlier versions of Solr used.

When you tell a Java program that it is allowed to use 4GB of memory,
it's going to use that memory.  Eventually.  Maybe not in three minutes,
but eventually.  Even the settings that you are seeing use less memory
WILL eventually use all of it that they have been allowed.  That is the
nature of Java.

>      Also clear is that Solr v8.1 is slightly faster than v8.0 when both
> use those TUNE values. A hidden benefit.
>      Without GC_TUNE settings Solr v8.1 shows its appetite for much
> memory, several GB's more than v8.0.

The CMS collector will be removed from Java at some point in the future.
We can't use it any more.

When you note that, for a given sequential process, certain settings
accomplish that process faster, that's a measure of throughput -- how
much data is pushed through in a given timeframe.  We really don't care
about that metric for Solr.  We care about latency.  Let's say that
setting 1 produces a typical processing time per request of 90
milliseconds, and setting 2 produces a typical processing time per
request of 100 milliseconds.  You might think setting 1 is better.  But
what if 1 percent of the requests with setting 1 take ten seconds, and
EVERY request with setting 2 takes 120 milliseconds or less?  As a
project, we are going to prefer setting 2.  That's not a theoretical
situation -- it's how things really work out with different garbage
collectors, and it's why Solr has the default settings that it does.
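
The arithmetic behind that comparison can be checked with a few lines of Java. This is a sketch on synthetic data, not a measurement: the class LatencyStats is hypothetical, and the sample sets simply encode the figures above (90 ms typical with 1 percent at ten seconds, versus 100 ms typical with a 120 ms worst case).

```java
import java.util.Arrays;

/** Hypothetical helper for comparing two latency distributions. */
final class LatencyStats {
    /** Arithmetic mean in milliseconds (0.0 for an empty array). */
    static double mean(long[] millis) {
        return Arrays.stream(millis).average().orElse(0.0);
    }

    /** Upper median of a non-empty array: the middle element once sorted. */
    static long median(long[] millis) {
        long[] sorted = millis.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    /** Worst observed latency in milliseconds (0 for an empty array). */
    static long max(long[] millis) {
        return Arrays.stream(millis).max().orElse(0L);
    }

    /** Synthetic data: every hundredth request takes slowMillis, the rest fastMillis. */
    static long[] samples(int n, long fastMillis, long slowMillis) {
        long[] out = new long[n];
        for (int i = 0; i < n; i++) {
            out[i] = (i % 100 == 99) ? slowMillis : fastMillis;
        }
        return out;
    }
}
```

With samples(1000, 90, 10000) for setting 1, the median is 90 ms but the mean is 189.1 ms and the worst case is 10000 ms. With samples(1000, 100, 120) for setting 2, the median is 100 ms, the mean is 100.2 ms and the worst case is 120 ms. The typical request looks better under setting 1, yet both the average and the tail are worse.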

Shawn

Re: Solr-8.1.0 uses much more memory

Joe Doupnik
     My comments are inserted in-line this time. Thanks for the
amplifications, Shawn.

On 27/05/2019 17:39, Shawn Heisey wrote:

> On 5/27/2019 9:49 AM, Joe Doupnik wrote:
>>      A few more numbers to contemplate. An experiment here, adding 80
>> PDF and PPTX files into an empty index.
>>
>> Solr v8.0 regular settings, 1.7GB quiescent memory consumption, 1.9GB
>> while indexing, 2.92 minutes to do the job.
>> Solr v8.0, using GC_TUNE from v8.1 solr.in.sh, 1.1GB quiescent, 1.3GB
>> while indexing,  2.97 minutes.
>> Solr v8.1, regular settings, 4.3GB quiescent, 4.4GB while indexing,
>> 1.67 minutes
>> Solr v8.1, using GC_TUNE from v8.1 solr.in.sh, 1.0GB quiescent, 1.3GB
>> while indexing, 1.53 minutes
>>
>>      It is clear that the GC_TUNE settings from v8.1 are beneficial
>> to v8.0, saving about 600MB of memory. That's not small change.
>
     Well, the numbers observed here tell a slightly different story:
TUNEing can help Solr v8.0. Confirmatory values from other folks would
be good to have. The memory concerned is what is taken from the system
as real memory, and the rest of the system is directly affected by that.
Java can subdivide its part as it wishes.
     Yes, the TUNE values were from Solr v8.1. To me that says those
values are late arriving for v8.0 and prior, but we have them now and
can use them to save system resources. Also, it means that Solr v8.1's
G1 needs more baking time; the new GC is not quite ready for normal
production work (to put it mildly).

> GC tuning will not change the amount of memory the program needs.  It
> *can't* change it.  All it can do is affect how the garbage collector
> works.  Different collectors can result in differences in how much
> memory an outside observer will see allocated, because one may be more
> aggressive about early collection than the other, but the amount of
> heap actually required by the program will not change.
>
> The commented out GC_TUNE settings in the 8.1 "bin/solr.in.sh" file
> are the old CMS settings that earlier versions of Solr used.
>
> When you tell a Java program that it is allowed to use 4GB of memory,
> it's going to use that memory.  Eventually.  Maybe not in three
> minutes, but eventually.  Even the settings that you are seeing use
> less memory WILL eventually use all of it that they have been
> allowed.  That is the nature of Java.
>
     Data here says there is a quiescent consumption value, a higher one
during intensive indexing, and a smaller one during routine query
handling. The point is the consumption peaks go away, memory is returned
to the system. That's what garbage collection is all about.

>>      Also clear is that Solr v8.1 is slightly faster than v8.0 when
>> both use those TUNE values. A hidden benefit.
>>      Without GC_TUNE settings Solr v8.1 shows its appetite for much
>> memory, several GB's more than v8.0.
>
> The CMS collector will be removed from Java at some point in the
> future.  We can't use it any more.
>
     Meanwhile we in the field can improve our current systems with the
TUNE settings. Solr v8.1 isn't ready yet for that workload, in my opinion.
     The latency discussion below is in need of hard experimental
evidence. That does not mean your analysis is incorrect, but rather we
simply don't know and ought not make decisions based on such
assumptions. I look forward to seeing decent test results.
     Thanks,
     Joe D.

> When you note that, for a given sequential process, certain settings
> accomplish that process faster, that's a measure of throughput --
> how much data is pushed through in a given timeframe.  We really don't
> care about that metric for Solr.  We care about latency.  Let's say
> that setting 1 produces a typical processing time per request of 90
> milliseconds, and setting 2 produces a typical processing time per
> request of 100 milliseconds.  You might think setting 1 is better. 
> But what if 1 percent of the requests with setting 1 take ten seconds,
> and EVERY request with setting 2 takes 120 milliseconds or less?  As a
> project, we are going to prefer setting 2.  That's not a theoretical
> situation -- it's how things really work out with different garbage
> collectors, and it's why Solr has the default settings that it does.
>
> Shawn


Re: Solr-8.1.0 uses much more memory

Joe Doupnik
     An interesting note on the memory returning issue for the G1
collector.
     https://openjdk.java.net/jeps/346
Entitled "JEP 346: Promptly Return Unused Committed Memory from G1"
with a summary saying "Enhance the G1 garbage collector to automatically
return Java heap memory to the operating system when idle."
     It goes on to say the following, and more:

"Motivation

Currently the G1 garbage collector may not return committed Java heap
memory to the operating system in a timely manner. G1 only returns
memory from the Java heap at either a full GC or during a concurrent
cycle. Since G1 tries hard to completely avoid full GCs, and only
triggers a concurrent cycle based on Java heap occupancy and allocation
activity, it will not return Java heap memory in many cases unless
forced to do so externally.

This behavior is particularly disadvantageous in container environments
where resources are paid by use. Even during phases where the VM only
uses a fraction of its assigned memory resources due to inactivity, G1
will retain all of the Java heap. This results in customers paying for
all resources all the time, and cloud providers not being able to fully
utilize their hardware.

If the VM were able to detect phases of Java heap under-utilization
("idle" phases), and automatically reduce its heap usage during that
time, both would benefit.

Shenandoah and OpenJ9's GenCon collector already provide similar
functionality.

Tests with a prototype in Bruno et al., section 5.5, shows that based on
the real-world utilization of a Tomcat server that serves HTTP requests
during the day, and is mostly idle during the night, this solution can
reduce the amount of memory committed by the Java VM by 85%."

     Please read the full web page to have a rounded view of that
discussion.
     Thanks,
     Joe D.


Re: Solr-8.1.0 uses much more memory

Joe Doupnik
     An interesting supplement to this discussion. The experiment this
time was to use Solr v8.1, omit the GC_TUNE items, but instead adjust
SOLR_HEAP. I had set the heap to 4GB, based on good intentions, and as
we have seen Solr v8.1 gobbles it up and does not return a farthing.
Thus I tried indexing a large (2600 docs) collection of .pdfs, .ppt, etc
files, but with the heap size gradually reduced from 4GB to 1GB. That
worked smoothly, and while indexing Solr is consuming about 1.5/1.6GB
and working hard. So, if a little is good then less must be better, yes?
512MB is too little and Solr barely starts and then shuts down. 1GB
seems to be a safe value for the heap, and no GC_TUNE settings. This is
true on my machines for both Oracle jdk 1.8 and openjdk 10.
     In passing, recommendations on the net suggest watching the action
via jconsole (in the Oracle jdk bundle and in the openjdk material).
Well, it has pretty pictures and many numbers which are far far away
from the basic values we see with top and ps aux | grep solr. Not
useful, even less believable if one asks my simple consumption question.
     So then, this leaves us with the usual question of just how much
heap space a Java app requires. The answer seems to be that no one really
knows; only experiments will reveal practical values.
     Thus we choose a heap value tested to be safe and observe that
memory use persists at that value until Solr is restarted, after which
it consumes a smaller amount, sufficient for answering queries rather
than indexing files. If the openjdk folks get their reduction work
(below) into our hands then idle memory may shrink further.
     In closing, Solr v8.1 has one very nice advantage over its
predecessors: indexing speed, about double that of v8.0.
     Thanks,
     Joe D.

On 27/05/2019 18:38, Joe Doupnik wrote:

>     An interesting note on the memory returning issue for the G1
> collector.
>     https://openjdk.java.net/jeps/346
> Entitled "JEP 346: Promptly Return Unused Committed Memory from G1"
> with a summary saying "Enhance the G1 garbage collector to
> automatically return Java heap memory to the operating system when idle."
>     It goes on to say the following, and more:
>
> "Motivation
>
> Currently the G1 garbage collector may not return committed Java heap
> memory to the operating system in a timely manner. G1 only returns
> memory from the Java heap at either a full GC or during a concurrent
> cycle. Since G1 tries hard to completely avoid full GCs, and only
> triggers a concurrent cycle based on Java heap occupancy and
> allocation activity, it will not return Java heap memory in many cases
> unless forced to do so externally.
>
> This behavior is particularly disadvantageous in container
> environments where resources are paid by use. Even during phases where
> the VM only uses a fraction of its assigned memory resources due to
> inactivity, G1 will retain all of the Java heap. This results in
> customers paying for all resources all the time, and cloud providers
> not being able to fully utilize their hardware.
>
> If the VM were able to detect phases of Java heap under-utilization
> ("idle" phases), and automatically reduce its heap usage during that
> time, both would benefit.
>
> Shenandoah and OpenJ9's GenCon collector already provide similar
> functionality.
>
> Tests with a prototype in Bruno et al., section 5.5, shows that based
> on the real-world utilization of a Tomcat server that serves HTTP
> requests during the day, and is mostly idle during the night, this
> solution can reduce the amount of memory committed by the Java VM by
> 85%."
>
>     Please read the full web page to have a rounded view of that
> discussion.
>     Thanks,
>     Joe D.
>
> On 27/05/2019 18:17, Joe Doupnik wrote:
>>     My comments are inserted in-line this time. Thanks for the
>> amplifications Shawn.
>>
>> On 27/05/2019 17:39, Shawn Heisey wrote:
>>> On 5/27/2019 9:49 AM, Joe Doupnik wrote:
>>>>      A few more numbers to contemplate. An experiment here, adding
>>>> 80 PDF and PPTX files into an empty index.
>>>>
>>>> Solr v8.0 regular settings, 1.7GB quiescent memory consumption,
>>>> 1.9GB while indexing, 2.92 minutes to do the job.
>>>> Solr v8.0, using GC_TUNE from v8.1 solr.in.sh, 1.1GB quiescent,
>>>> 1.3GB while indexing,  2.97 minutes.
>>>> Solr v8.1, regular settings, 4.3GB quiescent, 4.4GB while indexing,
>>>> 1.67 minutes
>>>> Solr v8.1, using GC_TUNE from v8.1 solr.in.sh, 1.0GB quiescent,
>>>> 1.3GB while indexing, 1.53 minutes
>>>>
>>>>      It is clear that the GC_TUNE settings from v8.1 are beneficial
>>>> to v8.0, saving about 600MB of memory. That's not small change.
>>>
>>     Well, the numbers observed here tell a slightly different story:
>> TUNEing can help Solr v8.0. Confirmatory values from other folks
>> would be good to have. The memory concerned is what is taken from the
>> system as real memory, and the rest of the system is directly
>> affected by that. Java can subdivide its part as it wishes.
>>     Yes, the TUNE values were from Solr v8.1. To me that says those
>> values are late arriving for v8.0 and prior, but we have them now and
>> can use them to save system resources. Also, it means that Solr
>> v8.1's G1 needs more baking time; the new GC is not quite ready for
>> normal production work (to put it mildly).
>>
>>> GC tuning will not change the amount of memory the program needs. 
>>> It *can't* change it.  All it can do is affect how the garbage
>>> collector works.  Different collectors can result in differences in
>>> how much memory an outside observer will see allocated, because one
>>> may be more aggressive about early collection than the other, but
>>> the amount of heap actually required by the program will not change.
>>>
>>> The commented out GC_TUNE settings in the 8.1 "bin/solr.in.sh" file
>>> are the old CMS settings that earlier versions of Solr used.
>>>
>>> When you tell a Java program that it is allowed to use 4GB of
>>> memory, it's going to use that memory.  Eventually.  Maybe not in
>>> three minutes, but eventually.  Even the settings that you are
>>> seeing use less memory WILL eventually use all of it that they have
>>> been allowed.  That is the nature of Java.
>>>
>>     Data here says there is a quiescent consumption value, a higher
>> one during intensive indexing, and a smaller one during routine query
>> handling. The point is the consumption peaks go away, memory is
>> returned to the system. That's what garbage collection is all about.
>>
>>>>      Also clear is that Solr v8.1 is slightly faster than v8.0 when
>>>> both use those TUNE values. A hidden benefit.
>>>>      Without GC_TUNE settings Solr v8.1 shows its appetite for much
>>>> memory, several GB more than v8.0.
>>>
>>> The CMS collector will be removed from Java at some point in the
>>> future.  We can't use it any more.
>>>
>>     Meanwhile we in the field can improve our current systems with
>> the TUNE settings. Solr v8.1 isn't ready yet for that workload, in my
>> opinion.
>>     The latency discussion below is in need of hard experimental
>> evidence. That does not mean your analysis is incorrect, but rather
>> we simply don't know and ought not make decisions based on such
>> assumptions. I look forward to seeing decent test results.
>>     Thanks,
>>     Joe D.
>>
>>> When you note that for a given sequential process, certain settings
>>> accomplish that process faster, that's a measure of throughput --
>>> how much data is pushed through in a given timeframe.  We really
>>> don't care about that metric for Solr.  We care about latency. Let's
>>> say that setting 1 produces a typical processing time per request of
>>> 90 milliseconds, and setting 2 produces a typical processing time
>>> per request of 100 milliseconds.  You might think setting 1 is
>>> better.  But what if 1 percent of the requests with setting 1 take
>>> ten seconds, and EVERY request with setting 2 takes 120 milliseconds
>>> or less?  As a project, we are going to prefer setting 2.  That's
>>> not a theoretical situation -- it's how things really work out with
>>> different garbage collectors, and it's why Solr has the default
>>> settings that it does.
>>>
>>> Shawn
>>
>
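
     To put numbers on the quoted latency argument, here is a small
worked example. The two sample sets are invented to match the
description (setting 1: 99 requests at 90 ms and 1 at ten seconds;
setting 2: 99 requests at 100 ms and 1 at 120 ms). They are not
measurements from Solr, and LatencyComparison is a hypothetical name.

```java
import java.util.Arrays;

// Illustrative only: compares two made-up latency distributions by
// median, mean and worst case.
final class LatencyComparison {

    // Arithmetic mean of the samples, in milliseconds.
    static double mean(long[] ms) {
        return Arrays.stream(ms).average().orElse(0.0);
    }

    // Nearest-rank percentile for an integer p in 1..100.
    static long percentile(long[] ms, int p) {
        long[] sorted = ms.clone();
        Arrays.sort(sorted);
        int rank = (p * sorted.length + 99) / 100; // ceil(p * n / 100)
        return sorted[Math.max(rank, 1) - 1];
    }

    // Assumed samples for "setting 1": fast typical case, rare long pause.
    static long[] settingOne() {
        long[] s = new long[100];
        Arrays.fill(s, 90L);
        s[99] = 10_000L;
        return s;
    }

    // Assumed samples for "setting 2": slightly slower, but bounded.
    static long[] settingTwo() {
        long[] s = new long[100];
        Arrays.fill(s, 100L);
        s[99] = 120L;
        return s;
    }

    static void report(String name, long[] ms) {
        System.out.printf("%s: median=%d ms, mean=%.1f ms, worst=%d ms%n",
                name, percentile(ms, 50), mean(ms), percentile(ms, 100));
    }

    public static void main(String[] args) {
        report("setting 1", settingOne());
        report("setting 2", settingTwo());
    }
}
```

     With these assumed samples, setting 1 has the better median (90 ms
against 100 ms) but a mean of 189.1 ms and a worst case of ten seconds,
while setting 2 never exceeds 120 ms. That is the trade-off the quoted
paragraph describes.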