heavy reads from disk when off-heap ram is constrained

heavy reads from disk when off-heap ram is constrained

lstusr 5u93n4
Hi All,

Something we learned recently that might be useful to the community.

We're running Solr in Docker, and we've constrained each of our containers
to 10G of the host's RAM. Through `docker stats`, we can also see the Block
I/O (filesystem reads/writes) that the Solr process is doing.
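
For reference, this is roughly the kind of setup we mean; the image tag,
container name and heap value below are placeholders rather than our exact
commands:

    # cap the container at 10G of host RAM, give the JVM only half of it
    # (solr-foreground should pass -m through to bin/solr)
    docker run -d --name solr1 --memory=10g solr:8.4 solr-foreground -m 5g

    # watch per-container Block I/O (filesystem reads/writes)
    docker stats --format "table {{.Name}}\t{{.MemUsage}}\t{{.BlockIO}}"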

On a test system with three nodes and three shards, each shard with two NRT
replicas, indexing a reference set of a million documents:

 - When allocating half of the container's available RAM to the JVM (i.e.
starting Solr with -m 5g), we see roughly 400M of reads and 2G of writes on
each Solr node.

 - When allocating ALL of the container's available RAM to the JVM (i.e.
starting Solr with -m 10g), we see around 10G of reads and 2G of writes on
each Solr node, and the latency on the underlying disk soars.

The takeaway here is that Solr really does need non-JVM RAM to function,
and if you're having performance issues, "adding more RAM to the JVM" isn't
always the right way to get things going faster.
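
If you want to sanity-check this on a node, something like the following
gives a quick picture of how much RAM the OS still has for the filesystem
cache (the exact column layout depends on your procps version):

    # "buff/cache" is the memory the OS can use to cache Lucene index files;
    # if the heap takes the whole container, there's nothing left here.
    free -h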

Best,

Kyle

RE: heavy reads from disk when off-heap ram is constrained

Markus Jelsma-2
Hello Kyle,

This is actually what the manual [1] clearly warns about. Snippet copied from the manual:

"When setting the maximum heap size, be careful not to let the JVM consume all available physical memory. If the JVM process space grows too large, the operating system will start swapping it, which will severely impact performance. In addition, the operating system uses memory space not allocated to processes for file system cache and other purposes. This is especially important for I/O-intensive applications, like Lucene/Solr. The larger your indexes, the more you will benefit from filesystem caching by the OS. It may require some experimentation to determine the optimal tradeoff between heap space for the JVM and memory space for the OS to use."

Please check it out, there are more useful hints to be found there.
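
In practice that usually means capping the heap explicitly and leaving the
rest of the machine's (or container's) RAM unallocated. A minimal sketch in
solr.in.sh, with the value purely illustrative:

    # /etc/default/solr.in.sh (or bin/solr.in.sh, depending on the install)
    # Fixed, modest heap for the JVM...
    SOLR_HEAP="5g"
    # ...whatever the heap does not claim stays free for the OS to use as
    # filesystem cache for the Lucene index files.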

Regards,
Markus

[1] https://lucene.apache.org/solr/guide/8_4/jvm-settings.html#JVMSettings-ChoosingMemoryHeapSettings

 
-----Original message-----

> From:lstusr 5u93n4 <[hidden email]>
> Sent: Thursday 27th February 2020 18:45
> To: [hidden email]
> Subject: heavy reads from disk when off-heap ram is constrained