SolrCloud scaling/optimization for high request rate


SolrCloud scaling/optimization for high request rate

Sofiya Strochyk

Hi everyone, 

We have a SolrCloud setup with the following configuration:

  • 4 nodes (3x128GB RAM Intel Xeon E5-1650v2, 1x64GB RAM Intel Xeon E5-1650v2, 12 cores, with SSDs) 
  • One collection, 4 shards, each has only a single replica (so 4 replicas in total), using compositeId router
  • Total index size is about 150M documents/320GB, so about 40M/80GB per node
  • Zookeeper is on a separate server
  • Documents consist of about 20 fields (most of them are both stored and indexed), average document size is about 2kB
  • Queries are mostly 2-3 words in the q field, with 2 fq parameters, with complex sort expression (containing IF functions)
  • We don't use faceting due to performance reasons but need to add it in the future
  • The majority of the documents are reindexed 2 times/day, as fast as Solr allows, in batches of 1000-10000 docs. Some of the documents are also deleted (by id, not by query)
  • autoCommit is set to maxTime of 1 minute with openSearcher=false and autoSoftCommit maxTime is 30 minutes with openSearcher=true. Commits from clients are ignored.
  • Heap size is set to 8GB.

Target query rate is up to 500 qps, maybe 300, and we need to keep response time at <200ms. But at the moment we only see very good search performance with up to 100 requests per second. Whenever it grows to about 200, average response time abruptly increases to 0.5-1 second. (Also, it seems that the request rate reported by Solr in admin metrics is 2x higher than the real one, because for every query, every shard receives 2 requests: one to obtain IDs and a second one to get the data by IDs; so the target rate in Solr metrics would be 1000 qps.)

During high request load, CPU usage increases dramatically on the Solr nodes. It doesn't reach 100% but averages at 50-70% on 3 servers and about 93% on 1 server (a random server each time, not the smallest one).

The documentation mentions replication to spread the load between the servers. We tested replicating to smaller servers (32GB RAM, Intel Core i7-4770), but the replicas were going out of sync all the time (possibly during commits) and reported errors like "PeerSync Recovery was not successful - trying replication." They then fall back to full replication, which takes hours, and the leader handles all requests single-handedly during that time. Also, both leaders and replicas started encountering OOM errors (heap space) for an unknown reason. Heap dump analysis shows that most of the memory is consumed by the [J (array of long) type; my best guess would be that it is the "_version_" field, but it's still unclear why this happens. Also, even though with replication the request rate and CPU usage per node drop by half, it doesn't seem to affect the mean_ms, stddev_ms or p95_ms numbers (p75_ms is much smaller on nodes with replication, but still not as low as under a load of <100 requests/s).

Garbage collection is also much more active during high load; full GC happens almost exclusively during those times. We have tried tuning GC options as suggested here (https://wiki.apache.org/solr/ShawnHeisey#CMS_.28ConcurrentMarkSweep.29_Collector), but it didn't change things.

My questions are

  • How do we increase throughput? Is replication the only solution?
  • If yes, then why doesn't it affect response times, considering that CPU is not 100% used and the index fits into memory?
  • How do we deal with OOM and replicas going into recovery?
  • Is memory or CPU the main problem? (When searching on the internet, I never see CPU as the main bottleneck for Solr, but our case might be different.)
  • Or do we need smaller shards? Could segment merging be a problem?
  • How can we add faceting without search queries slowing down too much?
  • How do we diagnose these problems and narrow down to the real reason in hardware or setup?
Any help would be much appreciated.

Thanks!

--
Sofiia Strochyk



[hidden email]
InterLogic
www.interlogic.com.ua


Re: SolrCloud scaling/optimization for high request rate

Erick Erickson
Some ideas:

1> What version of Solr? Solr 7.3 completely re-wrote Leader Initiated
Recovery and 7.5 has other improvements for recovery, we're hoping
that the recovery situation is much improved.

2> In the 7x code line, there are TLOG and PULL replicas. As of 7.5,
you can set things up so that queries are served by replica type; see:
https://issues.apache.org/jira/browse/SOLR-11982. This might help you
out. This moves all the indexing to the leader and reserves the rest
of the nodes for queries only, using old-style replication. I'm
assuming from your commit rate that latency between when updates
happen and the updates are searchable isn't a big concern.
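
For illustration, here is a rough SolrJ sketch of that setup: create a
collection with TLOG leaders plus PULL replicas, then route searches to the
PULL replicas with the shards.preference parameter that SOLR-11982 adds. The
collection name, config set, ZooKeeper address and replica counts are made
up; treat this as a sketch, not a recipe.

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class TlogPullSketch {
  public static void main(String[] args) throws Exception {
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1.example.com:2181"), Optional.empty()).build()) {

      // 4 shards; per shard: 1 TLOG replica (leader, does the indexing)
      // and 2 PULL replicas (serve queries via old-style replication).
      CollectionAdminRequest
          .createCollection("products", "products_conf", 4,
              0 /* NRT */, 1 /* TLOG */, 2 /* PULL */)
          .process(client);

      // Prefer PULL replicas for searches so the TLOG leaders keep their
      // CPU cycles for indexing.
      SolrQuery q = new SolrQuery("laptop bag");
      q.set("shards.preference", "replica.type:PULL");
      System.out.println(client.query("products", q).getResults().getNumFound());
    }
  }
}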

3> Just because the CPU isn't 100% doesn't mean Solr is running flat
out. There's I/O waits while sub-requests are serviced and the like.

4> As for how to add faceting without slowing down querying, there's
no way. Extra work is extra work. Depending on _what_ you're faceting
on, you may be able to do some tricks, but without details it's hard
to say. You need to get the query rate target first though ;)

5> OOMs. Hmm, you say you're doing complex sorts; are all fields
involved in sorts docValues=true? They have to be in order to be used in
function queries of course, but what about any fields that aren't?
What about your _version_ field?

6>  bq. "...indexed 2 times/day, as fast as the SOLR allows..." One
experiment I'd run is to test your QPS rate when there was _no_
indexing going on. That would give you a hint as to whether the
TLOG/PULL configuration would be helpful. There's been talk of
separate thread pools for indexing and querying to give queries a
better shot at the CPU, but that's not in place yet.
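
As a minimal sketch of that experiment with SolrJ: run a fixed number of
query threads against the cluster, once while the reindexing job is running
and once while it is paused, and compare throughput and latency. The
ZooKeeper address, collection name, query text and filter queries below are
placeholders.

import java.util.Collections;
import java.util.Optional;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class QpsProbe {
  public static void main(String[] args) throws Exception {
    final int threads = 20;            // fixed concurrency; vary and re-run
    final int queriesPerThread = 500;
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    AtomicLong totalLatencyMs = new AtomicLong();

    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1.example.com:2181"), Optional.empty()).build()) {
      long start = System.nanoTime();
      for (int t = 0; t < threads; t++) {
        pool.submit(() -> {
          for (int i = 0; i < queriesPerThread; i++) {
            try {
              long q0 = System.nanoTime();
              client.query("products", new SolrQuery("laptop bag")      // 2-3 word query
                  .addFilterQuery("category:bags", "in_stock:true"));   // 2 fq params
              totalLatencyMs.addAndGet((System.nanoTime() - q0) / 1_000_000);
            } catch (Exception e) {
              e.printStackTrace();
            }
          }
        });
      }
      pool.shutdown();
      pool.awaitTermination(1, TimeUnit.HOURS);
      long wallSec = Math.max(1, (System.nanoTime() - start) / 1_000_000_000);
      long n = (long) threads * queriesPerThread;
      System.out.printf("qps=%d, mean latency=%dms%n", n / wallSec, totalLatencyMs.get() / n);
    }
  }
}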

7> G1GC may also help rather than CMS, but as you're well aware GC
tuning "is more art than science" ;).

Good luck!
Erick


Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk

Thanks Erick,

1. We already use Solr 7.5; we upgraded some of our nodes only recently to see if this eliminates the difference in performance (it doesn't, but I'll test and see if the situation with replica syncing/recovery has improved since then).

2. Yes, we only open a searcher once every 30 minutes, so it is not an NRT case. But only certain combinations of replica types are recommended (https://lucene.apache.org/solr/guide/7_1/shards-and-indexing-data-in-solrcloud.html#combining-replica-types-in-a-cluster): all NRT, all TLOG, or TLOG+PULL. Currently we have all NRT replicas; would you suggest we change the leaders to TLOG and the other replicas to PULL? And this would also eliminate the redundancy provided by replication, because PULL replicas can't become leaders, right?

3. Yes, but then it would be reflected in the iowait metric, which is almost always near zero on our servers. Is there anything else Solr could be waiting for, and is there a way to check it? If we are going to need even more servers for faster responses and faceting, then there must be a way to know which resource we should get more of.

5. Yes, docValues are enabled for the fields we sort on (except score, which is an internal field); _version_ is left at the default, I think (type="long" indexed="false" stored="false", and it's also marked as having docValues in the admin UI).

6. QPS and response time seem to be about the same with and without indexing; server load also looks about the same, so I assume indexing doesn't take up a lot of resources (a little strange, but possible if it is limited by network or some of the other things from point 3).

7. Will try using G1 if nothing else helps... Haven't tested it yet because it is considered unsafe and I'd like to have all other options exhausted first. (And even then it is probably going to be a minor improvement? How much more efficient could it possibly be?)



--
Sofiia Strochyk



[hidden email]
InterLogic
www.interlogic.com.ua


Re: SolrCloud scaling/optimization for high request rate

Walter Underwood
The G1 collector should improve 95th percentile performance, because it limits the length of pauses.

With the CMS/ParNew collector, I ran very large Eden spaces, 2 GB out of an 8 GB heap. Nearly all of the allocations in Solr have the lifetime of one request, so you don’t want any of those allocations to be promoted to tenured space. Tenured space should be mostly cache evictions and should grow slowly.

For our clusters, when we hit 70% CPU, we add more CPUs. If we drive Solr much harder than that, it goes into congestion collapse. That is totally expected. When you use all of a resource, things get slow. Request more than all of a resource and things get very, very slow.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)


Re: SolrCloud scaling/optimization for high request rate

Toke Eskildsen
In reply to this post by Sofiya Strochyk
Sofiya Strochyk <[hidden email]> wrote:
> 5. Yes, docValues are enabled for the fields we sort on
> (except score which is an internal field); [...]

I am currently working on
https://issues.apache.org/jira/browse/LUCENE-8374
which speeds up DocValues-operations for indexes with many documents.

What "many" means is hard to say, but as can be seen in the JIRA, Tim Underwood sees a nice speed up for faceting with his 80M doc index. Hopefully it can also benefit your 40M doc (per shard) index with sorting on (I infer) multiple DocValued fields. I'd be happy to assist, should you need help with the patch.

- Toke Eskildsen

Re: SolrCloud scaling/optimization for high request rate

David Hastings
Would adding docValues in the schema, but not reindexing, cause
errors? I.e., only apply the docValues after the next reindex, but in the
meantime keep functioning as if there were none until then?


Re: SolrCloud scaling/optimization for high request rate

Toke Eskildsen
David Hastings <[hidden email]> wrote:
> Would adding the docValues in the schema, but not reindexing, cause
> errors?  IE, only apply the doc values after the next reindex, but in the
> meantime keep functioning as there were none until then?

As soon as you specify in the schema that a field has docValues=true, Solr treats all existing documents as having docValues enabled for that field. As there is no docValues content, DocValues-aware functionality such as sorting and faceting will not work for that field until the documents have been re-indexed.
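
As a sketch of what flipping a field over looks like through the Schema API
with SolrJ (this assumes a managed schema; the field name, type and
collection are hypothetical):

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.schema.SchemaRequest;

public class EnableDocValues {
  public static void main(String[] args) throws Exception {
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1.example.com:2181"), Optional.empty()).build()) {

      Map<String, Object> field = new LinkedHashMap<>();
      field.put("name", "popularity");   // hypothetical sort field
      field.put("type", "plong");
      field.put("stored", false);
      field.put("indexed", false);
      field.put("docValues", true);

      // Replace the existing field definition with a docValues=true one.
      new SchemaRequest.ReplaceField(field).process(client, "products");

      // As described above, documents indexed before this change carry no
      // docValues data; sorting/faceting on this field only works once
      // every document has been re-indexed.
    }
  }
}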

- Toke Eskildsen

Re: SolrCloud scaling/optimization for high request rate

Erick Erickson
Sofiya:

I haven't said so before, but it's a great pleasure to work with
someone who's done a lot of homework before pinging the list. The only
unfortunate bit is that it usually means the simple "Oh, I can fix
that without thinking about it much" doesn't work ;)

2.  I'll clarify a bit here. Any TLOG replica can become the leader.
Here's the process for an update:
- doc comes in to the leader (may be TLOG)
- doc is forwarded to all TLOG replicas, _but it is not indexed there_
- if the leader fails, the other TLOG replicas have enough documents in _their_ tlogs to "catch up" and one is elected
- you're totally right that PULL replicas cannot become leaders
- having all TLOG replicas means that the CPU cycles otherwise consumed by indexing are available for query processing

The point here is that TLOG replicas don't need to expend CPU cycles
to index documents, freeing up all those cycles for serving queries.

Now, that said you report that QPS rate doesn't particularly seem to
be affected by whether you're indexing or not, so that makes using
TLOG and PULL replicas less likely to solve your problem. I was
thinking about your statement that you index as fast as possible....


6. This is a little surprising. Here's my guess: you're indexing in
large batches and each batch is only really occupying a thread or two,
so it's effectively serialized and thus not consuming a huge amount of
resources.

So unless G1 really solves a lot of problems, more replicas are
indicated. On machines with large amounts of RAM and lots of CPUs, one
other option is to run multiple JVMs per physical node; that's
sometimes helpful.

One other possibility: in Solr 7.5, you have a ton of metrics
available. If you hit the admin/metrics endpoint you'll see 150-200
available metrics. Apart from running a profiler to see what's
consuming the most cycles, the metrics can give you a view into what
Solr is doing and may help you pinpoint what's using the most cycles.
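
A small sketch of pulling a slice of those metrics with SolrJ; the node URL
is hypothetical and the prefix is just an example (request times for the
/select handler):

import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class MetricsPeek {
  public static void main(String[] args) throws Exception {
    // /admin/metrics is a node-level endpoint, so point at one node directly.
    try (HttpSolrClient node =
        new HttpSolrClient.Builder("http://solr1.example.com:8983/solr").build()) {
      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("group", "core");                          // per-core metrics only
      params.set("prefix", "QUERY./select.requestTimes");   // handler latency stats
      NamedList<Object> metrics = node.request(
          new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics", params));
      System.out.println(metrics);
    }
  }
}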

Best,
Erick

Re: SolrCloud scaling/optimization for high request rate

Toke Eskildsen
In reply to this post by Sofiya Strochyk
Sofiya Strochyk <[hidden email]> wrote:
> Target query rate is up to 500 qps, maybe 300, and we need
> to keep response time at <200ms. But at the moment we only
> see very good search performance with up to 100 requests
> per second. Whenever it grows to about 200, average response
> time abruptly increases to 0.5-1 second.

Keep in mind that upping the number of concurrent searches in Solr does not raise throughput, if the system is already saturated. On the contrary, this will lower throughput due to thread- and memory-congestion.

As your machines have 12 cores (including HyperThreading) and IO does not seem to be an issue, 500 or even just 200 concurrent searches seem likely to result in lower throughput than (really guessing here) 100 concurrent searches. As Walter points out, the end result is collapse, but slowdown happens before that.

Consider putting a proxy in front with a maximum number of concurrent connections and a sensible queue, preferably after a bit of testing to locate where the highest throughput is. It won't make you hit your overall goal, but it can move you closer to it.
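
A real deployment would do this in an actual reverse proxy, but purely as an
illustration of the idea, here is a client-side sketch in Java: a fair
semaphore caps the number of in-flight searches (waiting callers form the
queue) and requests that wait too long are rejected. The concurrency limit
is something you would find by testing, as described above.

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;

/** Caps concurrent Solr searches; extra callers wait briefly or are rejected. */
public class ThrottledSearcher {
  private final SolrClient solr;
  private final Semaphore permits;
  private final long maxWaitMs;

  public ThrottledSearcher(SolrClient solr, int maxConcurrent, long maxWaitMs) {
    this.solr = solr;
    this.permits = new Semaphore(maxConcurrent, true);   // fair = FIFO-ish queue
    this.maxWaitMs = maxWaitMs;
  }

  public QueryResponse search(String collection, SolrQuery query) throws Exception {
    if (!permits.tryAcquire(maxWaitMs, TimeUnit.MILLISECONDS)) {
      throw new RuntimeException("search rejected: too many concurrent requests");
    }
    try {
      return solr.query(collection, query);
    } finally {
      permits.release();
    }
  }
}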

- Toke Eskildsen

Re: SolrCloud scaling/optimization for high request rate

Shawn Heisey
In reply to this post by Sofiya Strochyk
On 10/26/2018 9:55 AM, Sofiya Strochyk wrote:
>
> We have a SolrCloud setup with the following configuration:
>

I'm late to this party.  You've gotten some good replies already.  I
hope I can add something useful.

>   * 4 nodes (3x128GB RAM Intel Xeon E5-1650v2, 1x64GB RAM Intel Xeon
>     E5-1650v2, 12 cores, with SSDs)
>   * One collection, 4 shards, each has only a single replica (so 4
>     replicas in total), using compositeId router
>   * Total index size is about 150M documents/320GB, so about 40M/80GB
>     per node
>

With 80GB of index data and one node that only has 64GB of memory, the
full index won't fit into memory on that one server. With approximately
56GB of memory (assuming there's nothing besides Solr running on these
servers and the size of all Java heaps on the system is 8GB) to cache
80GB of index data, performance might be good.  Or it might be
terrible.  It's impossible to predict effectively.

>   * Heap size is set to 8GB.
>

I'm not sure that an 8GB heap is large enough.  Especially given what
you said later about experiencing OOM and seeing a lot of full GCs.

If properly tuned, the G1 collector is overall more efficient than CMS,
but CMS can be quite good.  If GC is not working well with CMS, chances
are that switching to G1 will not help.  The root problem is likely to
be something that a different collector can't fix -- like the heap being
too small.

I wrote the page you referenced for GC tuning.  I have *never* had a
single problem using G1 with Solr.

> Target query rate is up to 500 qps, maybe 300, and we need to keep
> response time at <200ms. But at the moment we only see very good
> search performance with up to 100 requests per second. Whenever it
> grows to about 200, average response time abruptly increases to 0.5-1
> second. (Also it seems that request rate reported by SOLR in admin
> metrics is 2x higher than the real one, because for every query, every
> shard receives 2 requests: one to obtain IDs and second one to get
> data by IDs; so target rate for SOLR metrics would be 1000 qps).
>

Getting 100 requests per second on a single replica is quite good,
especially with a sharded index.  I never could get performance like
that.  To handle hundreds of requests per second, you need several replicas.

If you can reduce the number of shards, the amount of work involved for
a single request will decrease, which MIGHT increase the queries per
second your hardware can handle.  With four shards, one query typically
turns into 9 separate requests.

Unless your clients are all Java-based, to avoid a single point of
failure, you need a load balancer as well.  (The Java client can talk to
the entire SolrCloud cluster and wouldn't need a load balancer)
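
For example, a SolrJ client pointed at the ZooKeeper ensemble (hosts below
are hypothetical) tracks cluster state itself and spreads requests over the
live replicas, so no separate load balancer is needed for Java clients:

import java.util.Arrays;
import java.util.Optional;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class ClusterAwareClient {
  public static void main(String[] args) throws Exception {
    // Reads cluster state from ZooKeeper, so it always knows which replicas
    // are up and balances requests across them.
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Arrays.asList("zk1.example.com:2181", "zk2.example.com:2181",
            "zk3.example.com:2181"),
        Optional.empty()).build()) {
      System.out.println(
          client.query("products", new SolrQuery("laptop bag")).getResults().getNumFound());
    }
  }
}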

What you are seeing where there is a sharp drop in performance from a
relatively modest load increase is VERY common.  This is the way that
almost all software systems behave when faced with extreme loads. 
Search for "knee" on this page:

https://www.oreilly.com/library/view/the-art-of/9780596155858/ch04.html

> During high request load, CPU usage increases dramatically on the SOLR
> nodes. It doesn't reach 100% but averages at 50-70% on 3 servers and
> about 93% on 1 server (random server each time, not the smallest one).
>

Very likely the one with a higher load is the one that is aggregating
shard requests for a full result.

> The documentation mentions replication to spread the load between the
> servers. We tested replicating to smaller servers (32GB RAM, Intel
> Core i7-4770). However, when we tested it, the replicas were going out
> of sync all the time (possibly during commits) and reported errors
> like "PeerSync Recovery was not successful - trying replication." Then
> they proceed with replication which takes hours and the leader handles
> all requests singlehandedly during that time. Also both leaders and
> replicas started encountering OOM errors (heap space) for unknown reason.
>

With only 32GB of memory, assuming 8GB is allocated to the heap, there's
only 24GB to cache the 80GB of index data.  That's not enough, and
performance would be MUCH worse than your 64GB or 128GB machines.

I would suspect extreme GC pauses and/or general performance issues from
not enough cache memory to be the root cause of the sync and recovery
problems.

> Heap dump analysis shows that most of the memory is consumed by [J
> (array of long) type, my best guess would be that it is "_version_"
> field, but it's still unclear why it happens.
>

I'm not familiar enough with how Lucene allocates memory internally to
have any hope of telling you exactly what that memory structure is.

> Also, even though with replication request rate and CPU usage drop 2
> times, it doesn't seem to affect mean_ms, stddev_ms or p95_ms numbers
> (p75_ms is much smaller on nodes with replication, but still not as
> low as under load of <100 requests/s).
>
> Garbage collection is much more active during high load as well. Full
> GC happens almost exclusively during those times. We have tried tuning
> GC options like suggested here
> <https://wiki.apache.org/solr/ShawnHeisey#CMS_.28ConcurrentMarkSweep.29_Collector>
> and it didn't change things though.
>

Symptoms like that generally mean that your heap is too small and needs
to be increased.

>   * How do we increase throughput? Is replication the only solution?
>

Ensuring there's enough memory for caching is the first step.  But that
can only take you so far.  Dealing with the very high query rate you've
got will require multiple replicas.

>   * if yes - then why doesn't it affect response times, considering
>     that CPU is not 100% used and index fits into memory?
>

Hard to say without an in-depth look.  See the end of my reply.

>   * How to deal with OOM and replicas going into recovery?
>

There are precisely two ways to deal with OOM.  One is to increase the
size of the resource that's depleted.  The other is to change things so
that the program doesn't require as much of that resource.  The second
option is frequently not possible.


>   * Is memory or CPU the main problem? (When searching on the
>     internet, i never see CPU as main bottleneck for SOLR, but our
>     case might be different)
>

Most Solr performance issues are memory related.  With an extreme query
rate, CPU can also be a bottleneck, but memory will almost always be the
bottleneck you run into first.

>   * Or do we need smaller shards? Could segments merging be a problem?
>

Smaller shards really won't make much difference in segment merging,
unless the size reduction is *EXTREME* -- switching to a VERY large
number of shards.

If you increase the numbers in your merge policy, then merging will
happen less frequently.  The config that I chose to use was 35 for
maxMergeAtOnce and segmentsPerTier, with 105 for
maxMergeAtOnceExplicit.  The disadvantage to this is that your indexes
will have a LOT more files in them, so it's much easier to run into an
open file limit in the OS.

>   * How to add faceting without search queries slowing down too much?
>

As Erick said ... this isn't possible.  To handle the query load you've
mentioned *with* facets will require even more replicas.  Facets require
more heap memory, more CPU resources, and are likely to access more of
the index data -- which means having plenty of cache memory is even more
important.
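
For reference, a plain field facet through SolrJ looks roughly like the
sketch below (field and collection names are made up). Whether it is fast
enough at this query rate depends on the field's cardinality, on having
docValues on the facet field, and on the cache memory discussed above.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetSketch {
  // Facet on a (hopefully low-cardinality) docValues field; keep rows small
  // so the extra work is mostly the facet itself.
  static void facetByCategory(SolrClient solr) throws Exception {
    SolrQuery q = new SolrQuery("laptop bag");
    q.setRows(10);
    q.setFacet(true);
    q.addFacetField("category");
    q.setFacetMinCount(1);
    q.setFacetLimit(20);

    QueryResponse rsp = solr.query("products", q);
    for (FacetField.Count c : rsp.getFacetField("category").getValues()) {
      System.out.println(c.getName() + " -> " + c.getCount());
    }
  }
}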

>   * How to diagnose these problems and narrow down to the real reason
>     in hardware or setup?
>

A lot of useful information can be obtained from the GC logs that Solr's
built-in scripting creates.  Can you share these logs?

The screenshots described here can also be very useful for troubleshooting:

https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue

Thanks,
Shawn


Re: SolrCloud scaling/optimization for high request rate

Deepak Goel
In reply to this post by Sofiya Strochyk


On Fri, Oct 26, 2018 at 9:25 PM Sofiya Strochyk <[hidden email]> wrote:

My questions are

  • How do we increase throughput? Is replication the only solution?
1. Increase the CPU speed
2. Increase the heap size (& tune the GC)
3. Replication 
4. Have one more node on the hardware server (if cpu is not reaching 100%)
  • if yes - then why doesn't it affect response times, considering that CPU is not 100% used and index fits into memory?
  • How to deal with OOM and replicas going into recovery?
1. Increase the heap size
2. Memory debug to check for memory leaks (rare) 
  • Is memory or CPU the main problem? (When searching on the internet, i never see CPU as main bottleneck for SOLR, but our case might be different)
1. Could be both 
  • Or do we need smaller shards? Could segments merging be a problem?
  • How to add faceting without search queries slowing down too much?
  • How to diagnose these problems and narrow down to the real reason in hardware or setup?
1. I would first tune all the software (OS, JVM, Solr) & benchmark the current hardware setup
2. Then I would play around with the hardware to check the performance benefits
Any help would be much appreciated.

An increase in response time to 1 sec when you bump up the load indicates queuing is happening in your setup. (Since CPU is not 100% utilised, it most likely points to a memory, disk, network or software problem.)

Last, what is the nature of your requests? Are the queries the same, or are they very random? Random queries would need more tuning than identical ones.

Deepak
"The greatness of a nation can be judged by the way its animals are treated. Please consider stopping the cruelty by becoming a Vegan"


"Plant a Tree, Go Green"


 

Re: SolrCloud scaling/optimization for high request rate

Shalin Shekhar Mangar
In reply to this post by Erick Erickson
What do your cache statistics look like? What are the hit ratio, size,
evictions etc.?

More comments inline:

On Sat, Oct 27, 2018 at 8:23 AM Erick Erickson <[hidden email]>
wrote:

> 6. This is a little surprising. Here's my guess: You're  indexing in
> large batches and the batch is only really occupying a thread or two
> so it's effectively serialized thus not consuming a huge amount of
> resources.
>

The CloudSolrClient parallelizes updates to each shard leader. But in this
case, there is only 1 shard so all updates are serialized. All indexing
activity is therefore being performed by a single CPU at a time.
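
If client-side indexing concurrency turns out to be the limiting factor, one
option is to send batches from several threads, along the lines of the
sketch below. Thread count, batch size, field names and collection are
illustrative only, and whether this helps depends on where the real
bottleneck is.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ParallelIndexer {
  public static void main(String[] args) throws Exception {
    final int threads = 4;
    final int batchSize = 1000;
    ExecutorService pool = Executors.newFixedThreadPool(threads);

    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("zk1.example.com:2181"), Optional.empty()).build()) {
      for (int t = 0; t < threads; t++) {
        final int worker = t;
        pool.submit(() -> {
          try {
            List<SolrInputDocument> batch = new ArrayList<>(batchSize);
            for (int i = 0; i < 100_000; i++) {
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", "doc-" + worker + "-" + i);
              doc.addField("title_t", "some product " + i);
              batch.add(doc);
              if (batch.size() == batchSize) {
                client.add("products", batch);   // no explicit commit; autoCommit handles it
                batch.clear();
              }
            }
            if (!batch.isEmpty()) {
              client.add("products", batch);
            }
          } catch (Exception e) {
            e.printStackTrace();
          }
        });
      }
      pool.shutdown();
      pool.awaitTermination(2, TimeUnit.HOURS);
    }
  }
}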




--
Regards,
Shalin Shekhar Mangar.

Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk
In reply to this post by Walter Underwood

Hi Walter,

Yes, after some point it gets really slow (before reaching 100% CPU usage), so unless G1 or further tuning helps, I guess we will have to add more replicas or shards.


On 26.10.18 20:57, Walter Underwood wrote:
The G1 collector should improve 95th percentile performance, because it limits the length of pauses.

With the CMS/ParNew collector, I ran very large Eden spaces, 2 Gb out of an 8 Gb heap. Nearly all of the allocations in Solr have the lifetime of one request, so you don’t want any of those allocations to be promoted to tenured space. Tenured space should be mostly cache evictions and should grow slowly.

For our clusters, when we hit 70% CPU, we add more CPUs. If we drive Solr much harder than that, it goes into congestion collapse. That is totally expected. When you use all of a resource, things get slow. Request more than all of a resource and things get very, very slow.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

On Oct 26, 2018, at 10:21 AM, Sofiya Strochyk [hidden email] wrote:

Thanks Erick,

1. We already use Solr 7.5, upgraded some of our nodes only recently to see if this eliminates the difference in performance (it doesn't, but I'll test and see if the situation with replicas syncing/recovery has improved since then)
2. Yes, we only open searcher once every 30 minutes so it is not an NRT case. But it is only recommended <https://lucene.apache.org/solr/guide/7_1/shards-and-indexing-data-in-solrcloud.html#combining-replica-types-in-a-cluster> to use NRT/TLOG/TLOG+PULL replica types together (currently we have all NRT replicas), would you suggest we change leaders to TLOG and slaves to PULL? And this would also eliminate the redundancy provided by replication because PULL replicas can't become leaders, right?
3. Yes but then it would be reflected in iowait metric, which is almost always near zero on our servers. Is there anything else Solr could be waiting for, and is there a way to check it? If we are going to need even more servers for faster response and faceting then there must be a way to know which resource we should get more of.
5. Yes, docValues are enabled for the fields we sort on (except score which is an internal field); _version_ is left at default i think (type="long" indexed="false" stored="false", and it's also marked as having DocValues in the admin UI)
6. QPS and response time seem to be about the same with and without indexing; server load also looks about the same so i assume indexing doesn't take up a lot of resources (a little strange, but possible if it is limited by network or some other things from point 3).

7. Will try using G1 if nothing else helps... Haven't tested it yet because it is considered unsafe and i'd like to have all other options exhausted first. (And even then it is probably going to be a minor improvement? How much more efficient could it possibly be?)

On 26.10.18 19:18, Erick Erickson wrote:
Some ideas:

1> What version of Solr? Solr 7.3 completely re-wrote Leader Initiated
Recovery and 7.5 has other improvements for recovery, we're hoping
that the recovery situation is much improved.

2> In the 7x code line, there are TLOG and PULL replicas. As of 7.5,
you can set up so the queries are served by replica type, see:
https://issues.apache.org/jira/browse/SOLR-11982 <https://issues.apache.org/jira/browse/SOLR-11982>. This might help you
out. This moves all the indexing to the leader and reserves the rest
of the nodes for queries only, using old-style replication. I'm
assuming from your commit rate that latency between when updates
happen and the updates are searchable isn't a big concern.

3> Just because the CPU isn't 100% doesn't mean Solr is running flat
out. There's I/O waits while sub-requests are serviced and the like.

4> As for how to add faceting without slowing down querying, there's
no way. Extra work is extra work. Depending on _what_ you're faceting
on, you may be able to do some tricks, but without details it's hard
to say. You need to get the query rate target first though ;)

5> OOMs Hmm, you say you're doing complex sorts, are all fields
involved in sorts docValues=true? They have to be to be used in
function queries of course, but what about any fields that aren't?
What about your _version_ field?

6>  bq. "...indexed 2 times/day, as fast as the SOLR allows..." One
experiment I'd run is to test your QPS rate when there was _no_
indexing going on. That would give you a hint as to whether the
TLOG/PULL configuration would be helpful. There's been talk of
separate thread pools for indexing and querying to give queries a
better shot at the CPU, but that's not in place yet.

7> G1GC may also help rather than CMS, but as you're well aware GC
tuning "is more art than science" ;).

Good luck!
Erick



Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk
In reply to this post by Erick Erickson

Erick,

Thanks, I've been pulling my hair out over this for a long time and have gathered a lot of information :)

Doesn't there exist a maxIndexingThreads setting in solrconfig, with a default value of about 8? It's not clear whether my updates are being executed in parallel or not, but I would expect them to use at least a few threads.
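
For illustration, whatever the server-side setting is, a good part of the indexing parallelism is decided by the client. A minimal SolrJ sketch using ConcurrentUpdateSolrClient follows; the node URL, field names, queue size and thread count are invented values, not taken from this setup:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class ParallelIndexer {
      public static void main(String[] args) throws Exception {
        // Hypothetical node URL and tuning values; adjust for the real cluster.
        try (SolrClient client = new ConcurrentUpdateSolrClient.Builder(
                "http://solr-node1:8983/solr/products")
            .withQueueSize(20000)  // docs buffered before callers block
            .withThreadCount(4)    // concurrent streams sending batches to Solr
            .build()) {
          for (int i = 0; i < 100_000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            doc.addField("title_t", "document " + i);
            client.add(doc);       // queued; background threads do the HTTP work
          }
          client.commit();         // or rely on autoCommit, as in this setup
        }
      }
    }

Each background thread holds its own connection, so the updates at least arrive at Solr on several streams instead of one.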

In the past we hosted 2 shards on one of the bigger nodes for some time, and this resulted in high load on that node and slow requests from those 2 shards (though not too much worse than now, with only 1 shard per node), so the nodes might be too small to handle 2 or more replicas.

Anyway, thanks for your help. I'll try profiling and looking into the metrics to see if there are some pointers to what is consuming the CPU...


On 27.10.18 05:52, Erick Erickson wrote:
Sofiya:

I haven't said so before, but it's a great pleasure to work with
someone who's done a lot of homework before pinging the list. The only
unfortunate bit is that it usually means the simple "Oh, I can fix
that without thinking about it much" doesn't work ;)

2.  I'll clarify a bit here. Any TLOG replica can become the leader.
Here's the process for an update:
- doc comes in to the leader (may be TLOG)
- doc is forwarded to all TLOG replicas, _but it is not indexed there_.
- If the leader fails, the other TLOG replicas have enough documents in _their_ tlogs to "catch up" and one is elected
- You're totally right that PULL replicas cannot become leaders
- having all TLOG replicas means that the CPU cycles otherwise consumed by indexing are available for query processing.
The point here is that TLOG replicas don't need to expend CPU cycles
to index documents, freeing up all those cycles for serving queries.

Now, that said you report that QPS rate doesn't particularly seem to
be affected by whether you're indexing or not, so that makes using
TLOG and PULL replicas less likely to solve your problem. I was
thinking about your statement that you index as fast as possible....


6. This is a little surprising. Here's my guess: You're indexing in
large batches and the batch is only really occupying a thread or two,
so it's effectively serialized and thus not consuming a huge amount of
resources.

So unless G1 really solves a lot of problems, more replicas are
indicated. On machines with large amounts of RAM and lots of CPUs, one
other option is to run multiple JVMs per physical node; that's
sometimes helpful.

One other possibility: in Solr 7.5 you have a ton of metrics
available. If you hit the admin/metrics endpoint you'll see 150-200
available metrics. Apart from running a profiler to see what's
consuming the most cycles, the metrics can give you a view into what
Solr is doing and may help you pinpoint what's using the most cycles.

Best,
Erick
On Fri, Oct 26, 2018 at 12:23 PM Toke Eskildsen [hidden email] wrote:
David Hastings [hidden email] wrote:
Would adding the docValues in the schema, but not reindexing, cause
errors? I.e., only apply the docValues after the next reindex, but in the
meantime keep functioning as if there were none until then?
As soon as you specify in the schema that a field has docValues=true, Solr treats all existing documents as having docValues enabled for that field. As there is no docValues content, docValues-aware functionality such as sorting and faceting will not work for that field until the documents have been re-indexed.

- Toke Eskildsen


Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk
In reply to this post by Toke Eskildsen-2

I think we could try that, but most likely it would turn out that at some point we are receiving 300 requests per second while only being able to reasonably handle 150 per second, which means everything else is going to be kept in a growing queue and increase response times even further...

Also, if one node has 12 cores, does that mean it can process 12 concurrent searches? And since every request is sent to all shards to check if there are results, does this also mean the whole cluster can handle 12 concurrent requests on average?


On 27.10.18 09:00, Toke Eskildsen wrote:
Sofiya Strochyk [hidden email] wrote:
Target query rate is up to 500 qps, maybe 300, and we need
to keep response time at <200ms. But at the moment we only
see very good search performance with up to 100 requests
per second. Whenever it grows to about 200, average response
time abruptly increases to 0.5-1 second. 
Keep in mind that upping the number of concurrent searches in Solr does not raise throughput, if the system is already saturated. On the contrary, this will lower throughput due to thread- and memory-congestion.

As your machines have 12 cores (including HyperThreading) and IO does not seem to be an issue, 500 or even just 200 concurrent searches seem likely to result in lower throughput than (really guessing here) 100 concurrent searches. As Walther points out, the end result is collapse, but the slowdown happens before that.

Consider putting a proxy in front with a maximum number of concurrent connections and a sensible queue, preferably after a bit of testing to locate where the highest throughput is. It won't make you hit your overall goal, but it can move you closer to it.

- Toke Eskildsen
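
To make the "limited concurrency plus a bounded queue" idea concrete, here is a minimal client-side sketch using only the JDK; in practice this would more likely be a proxy such as nginx or HAProxy in front of Solr, and the limits below are illustrative guesses rather than measured values:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class SearchGateway {
      // Illustrative numbers: at most 48 searches in flight, at most 200 queued;
      // anything beyond that is rejected immediately instead of piling up and
      // inflating response times for everyone.
      private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
          48, 48, 0L, TimeUnit.MILLISECONDS,
          new ArrayBlockingQueue<>(200),
          new ThreadPoolExecutor.AbortPolicy());

      public void submitSearch(Runnable solrQuery) {
        // Throws RejectedExecutionException when saturated; the caller can then
        // return an HTTP 503 instead of letting latency grow without bound.
        pool.execute(solrQuery);
      }
    }

The same effect can be had with a proxy's max-connections setting; the point is that both the concurrency limit and the queue length are explicit and bounded.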


Re: SolrCloud scaling/optimization for high request rate

Ere Maijala
In reply to this post by Sofiya Strochyk
Hi Sofiya,

You've already received a lot of ideas, but I think this wasn't yet
mentioned: You didn't specify the number of rows your queries fetch or
whether you're using deep paging in the queries. Both can be real
performance killers in a sharded index because a large set of records
has to be fetched from all shards. This consumes a relatively high
amount of memory, and even if the servers are able to handle a certain
number of these queries simultaneously, you'd run into garbage
collection trouble with more queries being served. So just one more
thing to be aware of!

Regards,
Ere


--
Ere Maijala
Kansalliskirjasto / The National Library of Finland

Re: SolrCloud scaling/optimization for high request rate

Toke Eskildsen-2
In reply to this post by Sofiya Strochyk
On Mon, 2018-10-29 at 10:55 +0200, Sofiya Strochyk wrote:
> I think we could try that, but most likely it turns out that at some
> point we are receiving 300 requests per second, and are able to
> reasonably handle 150 per second, which means everything else is
> going to be kept in the growing queue and increase response times
> even further..

Just as there should always be an upper limit on concurrent
connections, so should there be a limit on the queue. If your system is
overwhelmed (and you don't have some fancy auto-add-hardware), there
are only two possibilities: Crash the system or discard requests. It is
rarely the case that crashing the system is the acceptable action.

Queueing works to avoid congestion (improving throughput) and to avoid
crashes from the system being overwhelmed. If the queue runs full and
needs to discard requests, turning off the queue would just mean that
the system is overwhelmed instead.

> Also, if one node has 12 cores that would mean it can process 12
> concurrent searches? And since every request is sent to all shards to
> check if there are results, does this also mean the whole cluster can
> handle 12 concurrent requests on average?

It is typically the case that threads go idle while waiting for data
from memory or storage. This means that you get more performance out of
running more concurrent jobs than the number of CPUs.

How much one should over-provision is very hard to generalize, which is
why I suggest measuring (which of course also takes resources, this
time in the form of work hours). My rough suggestion of a factor 10 for
your system is guesswork erring on the side of a high number.

- Toke Eskildsen, Royal Danish Library
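
As a starting point for the measuring Toke suggests, a crude throughput probe might look like the sketch below. Everything in it is a placeholder: the URL, the query and the concurrency levels are invented, and a real test should replay a representative query log rather than a single constant query:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicLong;

    public class ThroughputProbe {
      // Placeholder endpoint; point this at a representative select query.
      private static final String QUERY_URL =
          "http://solr-node1:8983/solr/products/select?q=test&rows=10";

      public static void main(String[] args) throws Exception {
        for (int concurrency : new int[] {12, 24, 48, 96}) {
          AtomicLong completed = new AtomicLong();
          ExecutorService pool = Executors.newFixedThreadPool(concurrency);
          long end = System.nanoTime() + TimeUnit.SECONDS.toNanos(60);
          for (int i = 0; i < concurrency; i++) {
            pool.execute(() -> {
              while (System.nanoTime() < end) {
                if (fireQuery()) {
                  completed.incrementAndGet();
                }
              }
            });
          }
          pool.shutdown();
          pool.awaitTermination(2, TimeUnit.MINUTES);
          System.out.println(concurrency + " workers -> "
              + completed.get() / 60 + " queries/sec");
        }
      }

      private static boolean fireQuery() {
        try {
          HttpURLConnection conn =
              (HttpURLConnection) new URL(QUERY_URL).openConnection();
          try (InputStream in = conn.getInputStream()) {
            while (in.read() != -1) {
              // drain the response so the connection can be reused
            }
          }
          return true;
        } catch (IOException e) {
          return false; // count only successful queries
        }
      }
    }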



Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk
In reply to this post by Shalin Shekhar Mangar

Hi Shalin,

these are stats for caches used:

documentCache
class:org.apache.solr.search.LRUCache
description:LRU Cache(maxSize=128, initialSize=128)
stats:
CACHE.searcher.documentCache.cumulative_evictions:234923643
CACHE.searcher.documentCache.cumulative_hitratio:0
CACHE.searcher.documentCache.cumulative_hits:876732
CACHE.searcher.documentCache.cumulative_inserts:234997370
CACHE.searcher.documentCache.cumulative_lookups:235874102
CACHE.searcher.documentCache.evictions:2123759
CACHE.searcher.documentCache.hitratio:0
CACHE.searcher.documentCache.hits:3209
CACHE.searcher.documentCache.inserts:2124008
CACHE.searcher.documentCache.lookups:2127217
CACHE.searcher.documentCache.size:128
CACHE.searcher.documentCache.warmupTime:0

filterCache
class:org.apache.solr.search.FastLRUCache
description:Concurrent LRU Cache(maxSize=128, initialSize=128, minSize=115, acceptableSize=121, cleanupThread=false, autowarmCount=64, regenerator=org.apache.solr.search.SolrIndexSearcher$2@418597da)
stats:
CACHE.searcher.filterCache.cumulative_evictions:33848
CACHE.searcher.filterCache.cumulative_hitratio:1
CACHE.searcher.filterCache.cumulative_hits:79684607
CACHE.searcher.filterCache.cumulative_inserts:44931
CACHE.searcher.filterCache.cumulative_lookups:79729534
CACHE.searcher.filterCache.evictions:59
CACHE.searcher.filterCache.hitratio:1
CACHE.searcher.filterCache.hits:708519
CACHE.searcher.filterCache.inserts:118
CACHE.searcher.filterCache.lookups:708637
CACHE.searcher.filterCache.size:123
CACHE.searcher.filterCache.warmupTime:52330

queryResultCache
class:org.apache.solr.search.LRUCache
description:LRU Cache(maxSize=256, initialSize=256, autowarmCount=64, regenerator=org.apache.solr.search.SolrIndexSearcher$3@124adf10)
stats:
CACHE.searcher.queryResultCache.cumulative_evictions:38463897
CACHE.searcher.queryResultCache.cumulative_hitratio:0.01
CACHE.searcher.queryResultCache.cumulative_hits:271649
CACHE.searcher.queryResultCache.cumulative_inserts:38638030
CACHE.searcher.queryResultCache.cumulative_lookups:38909216
CACHE.searcher.queryResultCache.evictions:351561
CACHE.searcher.queryResultCache.hitratio:0.01
CACHE.searcher.queryResultCache.hits:1952
CACHE.searcher.queryResultCache.inserts:353004
CACHE.searcher.queryResultCache.lookups:354874
CACHE.searcher.queryResultCache.size:256
CACHE.searcher.queryResultCache.warmupTime:11724
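
The numbers above come from the admin UI; the same figures (and the other 150-200 metrics mentioned earlier in the thread) can also be pulled from the metrics API. A minimal sketch using only the JDK, with a hypothetical host name; the group and prefix parameters narrow the output to the searcher caches:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class DumpCacheMetrics {
      public static void main(String[] args) throws Exception {
        // Hypothetical node; group/prefix limit the output to searcher caches.
        URL url = new URL("http://solr-node1:8983/solr/admin/metrics"
            + "?group=core&prefix=CACHE.searcher");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (BufferedReader in = new BufferedReader(
            new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
          String line;
          while ((line = in.readLine()) != null) {
            System.out.println(line); // JSON with hit ratios, sizes, evictions
          }
        }
      }
    }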


On 29.10.18 09:51, Shalin Shekhar Mangar wrote:
What do your cache statistics look like? What are the hit ratio, size,
evictions, etc.?

More comments inline:

On Sat, Oct 27, 2018 at 8:23 AM Erick Erickson [hidden email]
wrote:

Sofiya:

I haven't said so before, but it's a great pleasure to work with
someone who's done a lot of homework before pinging the list. The only
unfortunate bit is that it usually means the simple "Oh, I can fix
that without thinking about it much" doesn't work ;)

2.  I'll clarify a bit here. Any TLOG replica can become the leader.
Here's the process for an update:
doc comes in to the leader (may be TLOG)
doc is forwarded to all TLOG replicas, _but it is not indexed there_.
If the leader fails, the other TLOG replicas have enough documents in
_their_ tlogs to "catch up" and one is elected
You're totally right that PULL replicas cannot become leaders
having all TLOG replicas means that the CPU cycles otherwise consumed by
indexing are available for query processing.

The point here is that TLOG replicas don't need to expend CPU cycles
to index documents, freeing up all those cycles for serving queries.

Now, that said you report that QPS rate doesn't particularly seem to
be affected by whether you're indexing or not, so that makes using
TLOG and PULL replicas less likely to solve your problem. I was
thinking about your statement that you index as fast as possible....


6. This is a little surprising. Here's my guess: You're  indexing in
large batches and the batch is only really occupying a thread or two
so it's effectively serialized thus not consuming a huge amount of
resources.

The CloudSolrClient parallelizes updates to each shard leader. But in this
case, there is only 1 shard so all updates are serialized. All indexing
activity is therefore being performed by a single CPU at a time.


So unless G1 really solves a lot of problems, more replicas are
indicated. On machines with large amounts of RAM and lots of CPUs, one
other option is to run multiple JVMs per physical node that's
sometimes helpful.

One other possibility. In Solr 7.5, you have a ton of metrics
available. If you hit the admin/metrics end point you'll see 150-200
available metrics. Apart from running  a profiler to see what's
consuming the most cycles, the metrics can give you a view into what
Solr is doing and may help you pinpoint what's using the most cycles.

Best,
Erick



Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk
In reply to this post by Shawn Heisey-2

Hi Shawn,

On 27.10.18 09:28, Shawn Heisey wrote:

> With 80GB of index data and one node that only has 64GB of memory, the full index won't fit into memory on that one server. With approximately 56GB of memory (assuming there's nothing besides Solr running on these servers and the size of all Java heaps on the system is 8GB) to cache 80GB of index data, performance might be good. Or it might be terrible. It's impossible to predict effectively.

Actually the smallest server doesn't look bad in terms of performance; it has been consistently better than the other ones (without replication), which seems a bit strange (it should be about the same or slightly worse, right?). I guess the memory being smaller than the index doesn't cause problems thanks to the fact that we use SSDs.

> I'm not sure that an 8GB heap is large enough. Especially given what you said later about experiencing OOM and seeing a lot of full GCs.

Increasing the heap is not a problem, but it doesn't seem to make any difference.

> If you can reduce the number of shards, the amount of work involved for a single request will decrease, which MIGHT increase the queries per second your hardware can handle. With four shards, one query typically is actually 9 requests.

> Unless your clients are all Java-based, to avoid a single point of failure, you need a load balancer as well. (The Java client can talk to the entire SolrCloud cluster and wouldn't need a load balancer)

What if we are sending requests to a machine which is part of the cluster but doesn't host any shards? Does it handle the initial request and merging of the results, or does this have to be handled by one of the shards anyway?

Also I was thinking "more shards -> each shard searches a smaller set of documents -> search is faster". Or is the overhead of merging results bigger than the overhead of searching a larger set of documents?
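
As a side note on the Java client Shawn mentions: CloudSolrClient reads the cluster state from ZooKeeper and routes each request to nodes that host the collection's replicas, which is why it doesn't need an external load balancer. A rough sketch (the ZooKeeper address and collection name are placeholders):

    import java.util.Collections;
    import java.util.Optional;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class ClusterAwareSearch {
      public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper ensemble and collection name.
        try (CloudSolrClient client = new CloudSolrClient.Builder(
            Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
          client.setDefaultCollection("products");
          QueryResponse rsp = client.query(new SolrQuery("some search terms"));
          System.out.println("hits: " + rsp.getResults().getNumFound());
        }
      }
    }
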
> Very likely the one with a higher load is the one that is aggregating shard requests for a full result.

Is there a way to confirm this? Maybe the aggregating shard is going to have additional requests in its solr.log?

> Most Solr performance issues are memory related. With an extreme query rate, CPU can also be a bottleneck, but memory will almost always be the bottleneck you run into first.

This is the advice I've seen often, but how exactly can we run out of memory if total RAM is 128GB, the heap is 8GB and the index size is 80GB? Especially since the node with 64GB runs just as fine, if not better.

> A lot of useful information can be obtained from the GC logs that Solr's built-in scripting creates. Can you share these logs?
>
> The screenshots described here can also be very useful for troubleshooting:
>
> https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue

I have attached some GC logs and screenshots; hope these are helpful (I can only attach small files).


Attachment: solr_gc.log.7.1 (565K)

Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk
In reply to this post by Deepak Goel

Hi Deepak and thanks for your reply,


On 27.10.18 10:35, Deepak Goel wrote:

> Last, what is the nature of your requests? Are the queries the same, or are they very random? Random queries would need more tuning than if the queries are the same.
The search term (q) is different for each query, and the filter query terms (fq) are repeated very often (so we have a very low hit ratio for the query result cache and a very high hit ratio for the filter cache).
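
That pattern lines up with the cache stats posted earlier: the combination of q, fq and sort forms the queryResultCache key, so a unique q per request makes that cache nearly useless, while each distinct fq string is cached and reused on its own in the filterCache. A hedged SolrJ sketch of such a request (field names and values are invented):

    import org.apache.solr.client.solrj.SolrQuery;

    public class QueryShape {
      public static void main(String[] args) {
        SolrQuery q = new SolrQuery();
        // Changes on every request, so the queryResultCache rarely hits.
        q.setQuery("red running shoes");
        // Repeated across many requests; each distinct fq is cached once in the
        // filterCache and reused as a doc set (hypothetical field names).
        q.addFilterQuery("category:footwear");
        q.addFilterQuery("in_stock:true");
        // The sort is part of the queryResultCache key as well.
        q.setSort("score", SolrQuery.ORDER.desc);
        System.out.println(q); // prints the q/fq/sort parameters
      }
    }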
