SolrCloud scaling/optimization for high request rate

Re: SolrCloud scaling/optimization for high request rate

Ere Maijala
From what I've gathered, and in my experience, docValues should be
enabled; but if you can't think of anything else, I'd try turning
them off to see if it makes any difference. As far as I can recall,
turning them off will increase the usage of Solr's own caches, and that
caused a noticeable slowdown for us, but your mileage may vary.
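
For reference, turning docValues off for the id field is a one-attribute change in the schema (a sketch; the field definition mirrors the one quoted below, and note that docValues default to on for string fields in recent schema versions):

```xml
<!-- Same definition as in the schema quoted below, with docValues
     explicitly disabled for the uniqueKey field. -->
<field name="_id" type="string" multiValued="false" indexed="true"
       required="true" stored="true" docValues="false"/>
```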

--Ere

Sofiya Strochyk wrote on 12.11.2018 at 14:23:

> Thanks for the suggestion Ere. It looks like they are actually enabled;
> in the schema file the field is only marked as indexed and stored
> (<field name="_id" type="string" multiValued="false" indexed="true"
> required="true" stored="true"/>), but the admin UI shows DocValues as
> enabled, so I guess this is the default. Is the solution to add
> "docValues=false" in the schema?
>
>
> On 12.11.18 10:43, Ere Maijala wrote:
>> Sofiya,
>>
>> Do you have docValues enabled for the id field? Apparently that can
>> make a significant difference. I'm failing to find the relevant
>> references right now, but just something worth checking out.
>>
>> Regards,
>> Ere
>>
>> Sofiya Strochyk wrote on 6.11.2018 at 16:38:
>>> Hi Toke,
>>>
>>> sorry for the late reply. The query I wrote here is edited to hide
>>> production details, but I can post additional info if it helps.
>>>
>>> I have tested all of the suggested changes, and none of them seems to
>>> make a noticeable difference (usually response time and other metrics
>>> fluctuate over time, and the changes caused by different parameters
>>> are smaller than the fluctuations). What this probably means is that
>>> the heaviest task is retrieving IDs by query rather than fields by ID.
>>> I've also checked the QTime logged for these types of operations, and
>>> it is much higher for "get IDs by query" than for "get fields by IDs
>>> list". What could be done about this?
>>>
>>> On 05.11.18 14:43, Toke Eskildsen wrote:
>>>> So far no answer from Sofiya. That's fair enough: My suggestions might
>>>> have seemed random. Let me try to qualify them a bit.
>>>>
>>>>
>>>> What we have to work with is the redacted query
>>>> q=<q expression>&fl=<full list of fields>&start=0&sort=<sort
>>>> expression>&fq=<fq expression>&rows=24&version=2.2&wt=json
>>>> and an earlier mention that sorting was complex.
>>>>
>>>> My suggestions were to try
>>>>
>>>> 1) Only request simple sorting by score
>>>>
>>>> If this improves performance substantially, we could try and see if
>>>> sorting could be made more efficient: Reducing complexity, pre-
>>>> calculating numbers etc.
>>>>
>>>> 2) Reduce rows to 0
>>>> 3) Increase rows to 100
>>>>
>>>> This measures one aspect of retrieval. If there is a big performance
>>>> difference between these two, we can further probe if the problem is
>>>> the number or size of fields - perhaps there is a ton of stored text,
>>>> perhaps there is a bunch of DocValued fields?
>>>>
>>>> 4) Set fl=id only
>>>>
>>>> This is a variant of 2+3 to do a quick check if it is the resolving of
>>>> specific field values that is the problem. If using fl=id speeds up
>>>> substantially, the next step would be to add fields gradually until
>>>> (hopefully) there is a sharp performance decrease.
>>>>
>>>> - Toke Eskildsen, Royal Danish Library
>>>>
>>>>
>>>
>>> --
>>> Sofiia Strochyk
>>> [hidden email] <mailto:[hidden email]>
>>> InterLogic
>>> www.interlogic.com.ua <https://www.interlogic.com.ua>
>>>
>>
>
> --
> Sofiia Strochyk
> [hidden email] <mailto:[hidden email]>
> InterLogic
> www.interlogic.com.ua <https://www.interlogic.com.ua>
>

--
Ere Maijala
Kansalliskirjasto / The National Library of Finland
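
The four experiments quoted above can be scripted so that each run differs in only one parameter; a minimal sketch in Python, with the Solr URL and the angle-bracket placeholders standing in for the redacted production query:

```python
from urllib.parse import urlencode

# Base parameters mirror the redacted query from this thread; the
# placeholder values and the Solr URL below are illustrative only.
BASE_PARAMS = {
    "q": "<q expression>",
    "fl": "<full list of fields>",
    "sort": "<sort expression>",
    "fq": "<fq expression>",
    "start": "0",
    "rows": "24",
    "wt": "json",
}

# One override per experiment, so each run isolates a single change.
VARIANTS = {
    "sort_by_score": {"sort": "score desc"},  # 1) simple sorting only
    "rows_0":        {"rows": "0"},           # 2) skip document retrieval
    "rows_100":      {"rows": "100"},         # 3) larger result page
    "fl_id_only":    {"fl": "id"},            # 4) skip field resolution
}

def build_variant(name):
    """Return the full parameter set for one diagnostic variant."""
    params = dict(BASE_PARAMS)
    params.update(VARIANTS[name])
    return params

def variant_url(name, base="http://localhost:8983/solr/collection1/select"):
    """Build the request URL; compare the reported QTime across variants."""
    return base + "?" + urlencode(build_variant(name))
```

Running each variant repeatedly and comparing the reported QTime should show which aspect (sorting, retrieval, or field resolution) dominates.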

Re: SolrCloud scaling/optimization for high request rate

Toke Eskildsen
In reply to this post by Sofiya Strochyk
On Mon, 2018-11-12 at 14:19 +0200, Sofiya Strochyk wrote:
> I'll check if the filter queries or the main query tokenizers/filters
> might have anything to do with this, but I'm afraid query
> optimization can only get us so far.

Why do you think that? As you tried eliminating sorting and retrieval
previously, the queries are all that's left. There are multiple
performance traps when querying and a lot of them can be bypassed by
changing the index or querying in a different way.

> Since we will have to add facets later, the queries will only become
> heavier, and there has to be a way to scale this setup and deal with
> both higher load and more complex queries.

There is of course a way. It is more a question of what you are willing
to pay.

If you have money, just buy more hardware: We know (with very high
probability) that it will work as your problem is search throughput,
which can be solved by adding more replicas on extra machines.

If you have more engineering hours, you can use them on some of the
things discussed previously:

* Pinpoint query bottlenecks
* Use fewer/more shards
* Apply https://issues.apache.org/jira/browse/LUCENE-8374
* Experiment with different numbers of concurrent requests to see what
gives the optimum throughput. This also tells you how much extra
hardware you need, if you decide you need to expand.
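
One rough way to interpret such a concurrency sweep is Little's law: sustainable throughput is approximately concurrency divided by mean latency, until latency starts climbing. A small sketch (the latency numbers are made up for illustration):

```python
def throughput(concurrency, mean_latency_s):
    """Little's law: the request rate a closed-loop load test sustains
    at a given concurrency level with the observed mean latency."""
    return concurrency / mean_latency_s

# Hypothetical sweep: mean latency (seconds) stays flat up to 16
# concurrent requests, then grows sharply as the servers saturate.
sweep = {1: 0.05, 4: 0.05, 16: 0.06, 64: 0.25}

# The sweet spot is the concurrency with the highest derived throughput;
# pushing past it mostly adds queueing latency, not throughput.
best = max(sweep, key=lambda c: throughput(c, sweep[c]))
```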


- Toke Eskildsen, Royal Danish Library



Re: SolrCloud scaling/optimization for high request rate

Sofiya Strochyk

Thanks to everyone for the suggestions. We managed to get performance to a bearable level by splitting the index into ~20 separate collections (one collection per country) and spreading them across the existing servers as evenly as possible. The largest country is also split into two shards. This means that:

1. QPS is lower for each instance, since it only receives requests for the corresponding country.

2. Index size is smaller for each instance, as it only contains documents for the corresponding country.

3. If one instance fails, most of the other instances keep running (except possibly the ones colocated with the failed one).

We didn't make any changes to the main query, but we have added a few fields to facet on. This had a small negative impact on performance, but overall everything kept working nicely.
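
The collection-per-country routing described above boils down to picking the collection name from the request's country before querying; a sketch (the naming scheme and URLs are hypothetical, not taken from the thread):

```python
def collection_for(country_code):
    """Map a request's country to the collection holding only that
    country's documents; sharding of the largest country happens at
    collection-creation time and is transparent to the query side."""
    return f"products_{country_code.lower()}"

def select_url(country_code, solr="http://localhost:8983/solr"):
    """Queries hit only the per-country collection, so each node sees
    a fraction of the total QPS and searches a smaller index."""
    return f"{solr}/{collection_for(country_code)}/select"
```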


On 14.11.18 12:18, Toke Eskildsen wrote:

--
Sofiia Strochyk

[hidden email]
InterLogic
www.interlogic.com.ua