Solr dual core performance

Solr dual core performance

Shweta Udapudi
Hi All,

I have a Solr server on hardware with 60GB RAM, split between 50GB for Solr and the rest for the OS. The search index is 120GB and is built offline. There are no updates to this index. I have 2 cores set up; they are completely identical, except that they are on 2 different disk drives.

The test was run with the same 3 million queries per core.
Going from one core to two, the medianRequestTime went from 1.24ms to 25ms. Performance is 20X worse.

Examined
- RAM usage and GC - very few full GC and nothing stands out as unusual
- CPU usage is about 10%
- Disk IO (currently collecting data)

What other components should I be looking at? Is there any shared resource between the 2 disks?
The statistics reported by Solr are below.

Thanks
Shweta

Single Core
CORE1
requests: 3077953
errors: 1580
timeouts: 0
totalTime: 115427000
avgRequestsPerSecond: 840.071468
5minRateReqsPerSecond: 371.1174956
15minRateReqsPerSecond: 871.0010528
avgTimePerRequest: 37.50122241
medianRequestTime: 1.245865
75thPcRequestTime: 1.9472615
95thPcRequestTime: 4.74148915
99thPcRequestTime: 7.27607746
999thPcRequestTime: 465.8678296


Dual Core
CORE1
requests: 3077953
errors: 1580
timeouts: 0
totalTime: 485785273.2
avgRequestsPerSecond: 52.17154443
5minRateReqsPerSecond: 6.17E-53
15minRateReqsPerSecond: 1.21E-16
avgTimePerRequest: 157.827385
medianRequestTime: 25.30935
75thPcRequestTime: 64.17734025
95thPcRequestTime: 209.5386719
99thPcRequestTime: 382.1332734
999thPcRequestTime: 960.208626

CORE2
requests: 3077953
errors: 1580
timeouts: 0
totalTime: 485785273.2
avgRequestsPerSecond: 52.60462395
5minRateReqsPerSecond: 3.11E-52
15minRateReqsPerSecond: 2.07E-16
avgTimePerRequest: 157.827385
medianRequestTime: 25.30935
75thPcRequestTime: 64.17734025
95thPcRequestTime: 209.5386719
99thPcRequestTime: 382.1332734
999thPcRequestTime: 960.208626
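
As a quick sanity check on the figures above, the following is a minimal sketch that recomputes avgTimePerRequest from totalTime and requests and derives the slowdown per percentile. The numbers are copied from the statistics in this post; the dictionary keys and function names are illustrative only and are not part of Solr.

```python
# Figures copied from the Solr statistics above (times in milliseconds).
single = {"requests": 3077953, "totalTime": 115427000.0,
          "median": 1.245865, "p95": 4.74148915, "p99": 7.27607746}
dual = {"requests": 3077953, "totalTime": 485785273.2,
        "median": 25.30935, "p95": 209.5386719, "p99": 382.1332734}


def avg_time_per_request(stats):
    """Average latency, assuming avgTimePerRequest = totalTime / requests."""
    return stats["totalTime"] / stats["requests"]


def slowdown(before, after, key):
    """How many times larger the 'after' value is than the 'before' value."""
    return after[key] / before[key]


for key in ("median", "p95", "p99"):
    print(f"{key}: {slowdown(single, dual, key):.1f}x slower")
print(f"avg, single core: {avg_time_per_request(single):.2f} ms")
print(f"avg, dual core:   {avg_time_per_request(dual):.2f} ms")
```

The recomputed averages agree with the reported avgTimePerRequest values, so the two sets of statistics are at least internally consistent.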







RE: Solr dual core performance

Shweta Udapudi
Most important information

solr-spec 5.4.1
solr-impl 5.4.1 1725212 - jpountz - 2016-01-18 11:51:45
lucene-spec 5.4.1
lucene-impl 5.4.1 1725212 - jpountz - 2016-01-18 11:44:59

java version "1.7.0_79"
OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.12.04.1)
OpenJDK 64-Bit Server VM (build 24.79-b02, mixed mode)



Re: Solr dual core performance

Erick Erickson
I strongly suspect you're not getting "real" searches, but
are hitting your query result cache or perhaps some other
cache. 1.24ms response times are quite unusual.

So checking the Solr queryResultCache hit ratio, whether any
fronting HTTP cache is being hit, and the like would be
my first step.
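
One way to read the queryResultCache hit ratio programmatically is sketched below. It assumes the per-core MBeans handler at /admin/mbeans with stats=true&cat=CACHE&wt=json, and it assumes the response is a flat "solr-mbeans" list whose cache entries carry "hits" and "lookups" counters. The exact layout and key names vary between Solr versions, so treat both as assumptions and compare against the cache statistics in the admin UI. The base URL and core name are placeholders.

```python
import json
from urllib.request import urlopen


def cache_stats_url(base_url, core):
    # Assumed endpoint: the per-core MBeans handler, restricted to caches.
    return f"{base_url}/{core}/admin/mbeans?stats=true&cat=CACHE&wt=json"


def hit_ratio(hits, lookups):
    """Fraction of lookups answered from the cache (0.0 if none yet)."""
    return hits / lookups if lookups else 0.0


def query_result_cache_ratio(payload):
    # Assumed layout: "solr-mbeans" is [category, {name: bean}, category, ...].
    beans = payload["solr-mbeans"]
    for category, entries in zip(beans[::2], beans[1::2]):
        if category == "CACHE" and "queryResultCache" in entries:
            stats = entries["queryResultCache"]["stats"]
            return hit_ratio(stats["hits"], stats["lookups"])
    return None  # cache not found in this response


def fetch_ratio(base_url, core):
    """Fetch the stats over HTTP; needs a running Solr instance."""
    with urlopen(cache_stats_url(base_url, core)) as resp:
        return query_result_cache_ratio(json.load(resp))
```

A call such as fetch_ratio("http://localhost:8983/solr", "core1") after a load test would show whether most queries were served from the cache; a ratio close to 1.0 would support the suspicion above.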

Allocating 50G of 60G RAM to Solr is an anti-pattern, see:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Your CPU usage is indicating that Solr isn't doing much at all, also
indicating that you aren't doing "real" searches unless your load
test is sending the queries serially.

And I'm not quite sure what you mean when you say "Dual core".
If you're talking about having two _Solr_ cores on the same machine,
they likely share the disk I/O. Since each response needs to read
data from disk (for stored fields), disk I/O would be a likely place to look.

Best,
Erick


Re: Solr dual core performance

Malcolm Upayavira Holmes
Having the same data as two cores (even on different disks) on the same
instance is, I'd say, pointless.

Basically, Solr makes heavy use of memory to cache the data you have on
disk - whether as in-heap caches, or the OS disk cache, i.e. memory not
allocated to the JVM.

By having two cores the same, you are forcing Solr and the OS to keep
*TWO* copies of your index in memory, halving the efficiency of your
available memory.
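
To put rough numbers on this, here is a back-of-the-envelope sketch using the figures from the original post (60GB RAM, 50GB given to Solr, 120GB index). It assumes that everything outside the JVM heap is available to the OS disk cache, which overstates the real figure, and it ignores Solr's in-heap caches. The function name is illustrative.

```python
def page_cache_coverage(total_ram_gb, heap_gb, index_gb, copies=1):
    """Fraction of the on-disk index that could sit in the OS disk cache.

    Simplifying assumption: all RAM outside the JVM heap is usable as cache.
    """
    os_cache_gb = max(total_ram_gb - heap_gb, 0)
    return min(os_cache_gb / (index_gb * copies), 1.0)


one_copy = page_cache_coverage(60, 50, 120, copies=1)
two_copies = page_cache_coverage(60, 50, 120, copies=2)
print(f"one core:  {one_copy:.1%} of the index data fits in the OS cache")
print(f"two cores: {two_copies:.1%} of the index data fits in the OS cache")
```

Under these assumptions only a small fraction of the index can be cached even with one core, and a second identical core halves it again.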

Also, note that there's no point assigning something between 32GB-45GB
(can't remember the exact numbers) to your JVM, as at that point it will
switch from 32-bit addresses to 64-bit, and consequently until you reach
the upper bound, you will see no benefit - or worse, you'll see worse
memory behaviour as you'll effectively have less available.

Upayavira


RE: Solr dual core performance

Shweta Udapudi
Erick,

My test procedure has been:
- empty the OS cache
- reload the Solr cores (to empty the Solr caches)
- execute the 3M queries with 50 users/threads (using JMeter)
- record the Solr-reported stats and the JMeter stats
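
The first two steps of this procedure could be scripted along the following lines. This is only a sketch under stated assumptions: the post does not say how the OS cache was emptied, so writing to /proc/sys/vm/drop_caches (Linux only, requires root) is an assumed method, and the base URL and core names are placeholders. The reload uses the CoreAdmin RELOAD action. The JMeter run and the stats collection are left out.

```python
import subprocess
from urllib.parse import urlencode
from urllib.request import urlopen


def reload_url(base_url, core):
    """CoreAdmin RELOAD request for one core."""
    return f"{base_url}/admin/cores?" + urlencode({"action": "RELOAD", "core": core})


def drop_os_cache():
    # Assumed method; Linux only and needs root.
    # Flush dirty pages first, then drop page cache, dentries and inodes.
    subprocess.run(["sync"], check=True)
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")


def prepare_run(base_url, cores):
    """Empty the OS cache, then reload each core to clear the Solr caches."""
    drop_os_cache()
    for core in cores:
        urlopen(reload_url(base_url, core)).read()
```

Calling prepare_run("http://localhost:8983/solr", ["core1", "core2"]) before each JMeter run would give every run the same cold start, which makes the single-core and dual-core numbers easier to compare.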

Thank you for pointing out that the CPU was not doing much work. That led me to suspect the validity of the test, so I ran the 1.24ms test again. This time Solr reported an 18ms medianRequestTime, which is more in line with what I expected to see.

With respect to allocating 50GB of 64GB (it's actually 64GB, not 60GB) to Solr, this is equivalent to the configuration we run in production. In production we have 120GB RAM with 90GB allocated to Solr. I was handed this configuration to test on.

Thanks for directing me to the article. Our current solrconfig.xml has <directoryFactory name="DirectoryFactory" class="solr.NIOFSDirectoryFactory"/>. We definitely need to revisit our JVM settings and solrconfig values to get the best out of Solr.

Upayavira,

This effort to install and test 2 identical cores is driven by the operations department, so that they can have minimal downtime when they update the index. There will only be one active core at a time in production for now.

As an extension to this effort, and looking into the future, we would like to understand the impact on performance if both cores are serving requests simultaneously.

"By having two cores the same, you are forcing Solr and the OS to keep
*TWO* copies of your index in memory, halving the efficiency of your available memory."

I am aware of this. The peak RAM usage in production is at 40% of 120GB, so we think there is room to accommodate 2 active cores with minimal impact on response time. I am fairly new to Solr and currently believe we need to tune the JVM and revisit our solrconfig, not necessarily run two cores, for better response times. But I need some statistics to make an informed decision, hence the test.

Thank you
Shweta
