Solr cloud setup

Midas A
Hi,

We are currently on a master-slave architecture and want to move to a
SolrCloud architecture.
How should I decide the number of shards in SolrCloud?

My current Solr is version 6 and the index size is 300 GB.



Regards,
Abhishek Tiwari

Re: Solr cloud setup

Emir Arnautović
Hi Abhishek,
Here is a nice blog post about migrating to SolrCloud: https://sematext.com/blog/solr-master-slave-solrcloud-migration/

Re number of shards - there is no definite answer - it depends on your indexing/search latency requirements. Only tests can tell. Here are some thoughts on how to perform tests: https://www.od-bits.com/2018/01/solrelasticsearch-capacity-planning.html

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

Re: Solr cloud setup

Erick Erickson
First of all, do not shard unless necessary to handle your QPS requirements. Sharding adds overhead and has some functionality limitations. How to define “necessary”? Load test a single shard (or even stand-alone with a single core) until it falls over. See: https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ for an outline of the process.

“Handle your QPS requirements” is a bit tricky. What I’m talking about there is the ability to
1> index at an adequate speed
2> get queries back with acceptable latency.
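
As a rough illustration of reading the results of such a load test, here is a minimal Python sketch. It assumes you ramp load in steps against a single shard and record, for each step, the query rate, the 95th-percentile latency, and the indexing throughput. The `LoadStep` type, the function name, the thresholds, and all the numbers are hypothetical, not from any Solr tool.

```python
from dataclasses import dataclass


@dataclass
class LoadStep:
    qps: float             # query rate driven during this step
    p95_latency_ms: float  # observed 95th-percentile query latency
    docs_per_sec: float    # indexing throughput sustained during this step


def max_sustainable_qps(steps, latency_budget_ms, min_docs_per_sec):
    """Walk the steps in increasing QPS and stop at the first one that
    misses either target; return the last QPS that met both, or None."""
    best = None
    for step in sorted(steps, key=lambda s: s.qps):
        too_slow = step.p95_latency_ms > latency_budget_ms
        indexing_starved = step.docs_per_sec < min_docs_per_sec
        if too_slow or indexing_starved:
            break
        best = step.qps
    return best


# Hypothetical measurements from ramping load against one shard.
steps = [
    LoadStep(qps=5, p95_latency_ms=40, docs_per_sec=2000),
    LoadStep(qps=10, p95_latency_ms=70, docs_per_sec=2000),
    LoadStep(qps=20, p95_latency_ms=180, docs_per_sec=1900),
    LoadStep(qps=40, p95_latency_ms=900, docs_per_sec=1200),
]

print(max_sustainable_qps(steps, latency_budget_ms=200, min_docs_per_sec=1500))
```

The point of checking both conditions in the same step is that query latency and indexing speed compete for the same hardware, so a shard only "handles" a load level if both targets hold at once.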

Let’s say you test and can get 20 queries per second on a single shard, but need 100 QPS. Then add 4 more _replicas_ (not shards) to that single-sharded system for a total of 5 replicas x 1 shard.
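
The arithmetic behind that example can be sketched in a few lines of Python. This assumes query throughput scales roughly linearly with the number of replicas, which you should confirm by testing; the function name is made up for illustration.

```python
import math


def replicas_needed(required_qps, qps_per_replica):
    """Total replicas of one shard needed to serve required_qps,
    assuming each replica sustains qps_per_replica and scaling is linear."""
    if qps_per_replica <= 0:
        raise ValueError("qps_per_replica must be positive")
    return math.ceil(required_qps / qps_per_replica)


total = replicas_needed(required_qps=100, qps_per_replica=20)
extra = total - 1  # replicas to add on top of the one you load tested
print(total, extra)
```

With the numbers above this gives 5 replicas in total, i.e. 4 added to the original one.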

My general expectation (and YMMV) is for 50M docs/shard. I’ve seen as many as 300M docs on a single shard and as few as 10M, so the range is very wide depending on your particular needs. Given your index size, you’re in the range where sharding becomes desirable, but you have to test first.
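
To show how wide that range is in practice, here is a small Python sketch that turns a docs-per-shard figure into a shard count to begin testing with. The thread only gives the index size (300 GB), not the document count, so the 200M total below is purely hypothetical, and the result is a starting point for load tests rather than an answer.

```python
import math


def starting_shard_count(total_docs, docs_per_shard=50_000_000):
    """Shard count to start testing with, given a docs-per-shard target."""
    if docs_per_shard <= 0:
        raise ValueError("docs_per_shard must be positive")
    return max(1, math.ceil(total_docs / docs_per_shard))


total_docs = 200_000_000  # hypothetical; substitute your real document count

for docs_per_shard in (10_000_000, 50_000_000, 300_000_000):
    print(docs_per_shard, starting_shard_count(total_docs, docs_per_shard))
```

For the hypothetical 200M documents, the three per-shard figures mentioned above give 20, 4, and 1 shards respectively, which is why only a test against your own data and queries can settle it.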

Finally, note that there’s quite a jump in indexing cost going from 1 replica (leader only) to 2. The leader has to forward the docs to the follower, and that shows up. In very heavy indexing scenarios I’ve seen this matter; if it does in your situation, consider TLOG or PULL replica types.
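
For reference, a hedged Python sketch of how replica types are requested when creating a collection through the Collections API. The `nrtReplicas`, `tlogReplicas`, and `pullReplicas` parameters exist from Solr 7.0 onward, so a Solr 6 installation would need an upgrade first. The host, port, and collection name are placeholders, and the helper only builds the request URL; it does not send anything.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards,
                          nrt_replicas=0, tlog_replicas=0, pull_replicas=0):
    """Build a Collections API CREATE URL with explicit replica types.
    Replica-type parameters are included only when greater than zero."""
    if nrt_replicas + tlog_replicas == 0:
        # PULL replicas cannot become leader, so each shard needs
        # at least one NRT or TLOG replica.
        raise ValueError("need at least one NRT or TLOG replica per shard")
    params = {"action": "CREATE", "name": name, "numShards": num_shards}
    if nrt_replicas > 0:
        params["nrtReplicas"] = nrt_replicas
    if tlog_replicas > 0:
        params["tlogReplicas"] = tlog_replicas
    if pull_replicas > 0:
        params["pullReplicas"] = pull_replicas
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Placeholder host and collection name: 1 shard, 2 TLOG + 3 PULL replicas.
url = create_collection_url("http://localhost:8983/solr", "products",
                            num_shards=1, tlog_replicas=2, pull_replicas=3)
print(url)
```

The example layout keeps the single shard with 5 replicas in total, but only the TLOG replicas receive forwarded updates from the leader; the PULL replicas copy index segments from it instead.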

Best,
Erick