SOLR cache tuning

SOLR cache tuning

Tarun Jain
Hi,
I have a SOLR installation in a master-slave configuration. The slave is used only for reads and the master for writes.
I wanted to know if there is anything I can do to improve the performance of the read-only slave instance.
I am running SOLR 8.5 and Java 14. The JVM has 24 GB of RAM allocated. The server has 256 GB of RAM with about 50 GB free (the rest is used by other services on the server). The index is 15 GB in size with about 2 million documents.
We run a lot of queries where documents are fetched using filter queries, and occasionally all 2 million documents are read. My initial idea to speed up SOLR is that, given the amount of memory available, SOLR should be able to keep the entire index on the heap (I know the OS will also cache the disk blocks).
My solrconfig has the following:
<query>
  <maxBooleanClauses>200000</maxBooleanClauses>
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0" />
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0" />
  <documentCache class="solr.LRUCache" size="8192" initialSize="8192" autowarmCount="0" />
  <cache name="perSegFilter" class="solr.search.LRUCache" size="10" initialSize="0" autowarmCount="10" regenerator="solr.NoOpRegenerator" />
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
  <queryResultWindowSize>20</queryResultWindowSize>
  <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
  <useColdSearcher>false</useColdSearcher>
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
I have modified the documentCache size to 8192 from 512 but it has not helped much. 
I know this question has probably been asked a few times and I have read everything I could find out about SOLR cache tuning. I am looking for some more ideas.
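
One setting in the configuration above that may be worth experimenting with: every cache has autowarmCount="0", so each new searcher opened on the slave after replication starts with empty caches. The fragment below is only a sketch of how autowarming and a warming listener could be declared. The autowarmCount value and the fq clause are illustrative assumptions, not values from this thread, and whether it helps depends on how often the same filters repeat.

```xml
<query>
  <!-- Carry up to 128 recently used filter entries into each new searcher.
       128 is an assumed, illustrative value, not a recommendation. -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128" />

  <!-- Run fixed warming queries against each new searcher before it serves traffic.
       The fq below is a hypothetical placeholder; substitute filters the application really uses. -->
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="fq">category:books</str>
      </lst>
    </arr>
  </listener>
</query>
```

Warming makes each new searcher take longer to open, so it trades replication-to-visibility delay for warmer caches on the first queries.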

Any ideas?
Tarun Jain
Re: SOLR cache tuning

Walter Underwood
Reading all the documents is going to be slow. If you want to do that, use a database.

You do NOT keep all of the index in heap. Solr doesn’t work like that.

Your JVM heap is probably way too big for 2 million documents, but I doubt that is the performance issue. We use an 8 GB heap for all of our Solr instances, including one with about 5 million docs per shard.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Jun 1, 2020, at 8:28 AM, Tarun Jain <[hidden email]> wrote:


Re: SOLR cache tuning

Jörn Franke
In reply to this post by Tarun Jain
You should not have other processes or containers running on the same node. They can disrupt your OS cache and slow things down: for example, if the other processes also read files, they can evict Solr's data from the OS cache, which then has to be filled again.

What performance do you have now and what performance do you expect?

For full reads I would try to export all the data daily and offer it as a simple HTTPS download or on an object store. Maybe when you process the documents for indexing you can already put them on an object store or similar, so you don’t need Solr at all to export all of the documents.


See also Walter's message.

> On 01.06.2020 at 17:29, Tarun Jain <[hidden email]> wrote:

Re: SOLR cache tuning

Tarun Jain
Hi,
Thanks for the replies so far.
Walter: We have a few more SOLR cores, so the JVM is sized accordingly. I know we could separate the cores, but for easier maintainability we run them all in one SOLR instance. Also, only one core is being used the majority of the time.
Jörn: I don't have a particular performance number in mind. I am exploring what kind of tuning can be done on a read-only slave on a server with tons of RAM.
--------------
Earlier today, while reading the SOLR documentation, I saw that CaffeineCache is the preferred caching implementation. So I switched my SOLR core to use CaffeineCache, and the benchmarking results are very good. The time to read 1.8 million documents has gone down from 210+ secs to ~130 secs just by using CaffeineCache! That is a gain of nearly 40%.
I would recommend switching to CaffeineCache asap as it seems to be a simple change to get a very good speed up. 
I tried various sizes, and it looks like the default size of 512 is fine for filterCache and queryResultCache. The documentCache in my case gives slightly better results with size=8192.
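
For readers who want to try the same change, the cache section might look roughly like the sketch below. The final configuration was not posted, so this assumes that only the class attribute was switched to solr.CaffeineCache and that the sizes are the ones mentioned above (512, 512, and 8192), with the remaining attributes carried over from the original post.

```xml
<query>
  <!-- Assumption: only the class attribute differs from the originally posted configuration. -->
  <filterCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="0" />
  <queryResultCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="0" />
  <documentCache class="solr.CaffeineCache" size="8192" initialSize="8192" autowarmCount="0" />
</query>
```

The core has to be reloaded (or SOLR restarted) for the new cache classes to take effect, and the Plugins / Stats page in the admin UI shows per-cache hit ratios that can be compared before and after the change.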
If anyone else has any other tips on improving performance by changing parameters, please let me know.
Thanks for the replies so far.
Tarun Jain

On Monday, June 1, 2020, 01:55:56 PM EDT, Jörn Franke <[hidden email]> wrote:
 