I need some advice on my SolrCloud cluster and the DIH. I have a cluster of 3 cloud servers. Every server runs a Solr instance and a ZooKeeper instance. I start it with the -DzkHost parameter. It works great; I send updates as XML with curl.
Solr has 2 million docs in the index. Now I want an extra field: content2. I added it to my schema and uploaded the config to the cluster again with -Dbootstrap_confdir and -Dcollection.configName. It has been replicated to the whole cluster.
Now I need a re-index to add the field to every doc. I have a database with all the data and want to use the DIH full-import (this is how I did it in previous Solr versions). When I run it, it indexes at 3 docs/s (really slow). When I run Solr alone (not SolrCloud) it does 600 docs/s.
What's the best way to do a full re-index with SolrCloud? Does SolrCloud support DIH?
> When i run this it goes with 3 doc/s(Really
> slow). When i run solr alone(not solrcloud) it goes 600 docs/sec.
> What's the best way to do a full re-index with solrcloud? Does solrcloud
> support DIH?
SolrCloud supports DIH, but not fully and happily. DIH is set up to work pretty nicely with non-SolrCloud Solr, where it loads pretty quickly. With SolrCloud a few things can happen. One is that you might be running DIH on a replica rather than a leader, and which node is the leader can change without your consent. In that case all docs will go to another node (the leader) and then come back. SolrCloud also really works best with multiple indexing threads, and DIH will only use one, to my knowledge.
Still, at 3 docs/s, something sounds wrong. That's too slow.
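If DIH stays slow, one alternative is to skip it and push the rows yourself from several threads, which is the multi-threaded style of indexing mentioned above. Here is a rough Python sketch of that idea, not a tested recipe for your setup. It assumes the rows have already been read from your database into a list of dicts, and that the update handler is reachable at the placeholder URL below (host, port and collection name are guesses). The function names, batch size and worker count are all made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen
from xml.sax.saxutils import escape, quoteattr

# Assumed endpoint: host, port and collection name are placeholders.
UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def build_add_xml(docs):
    """Render a list of {field: value} dicts as a Solr XML <add> message."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            # Escape names and values so characters like < and & stay valid XML.
            parts.append("<field name=%s>%s</field>"
                         % (quoteattr(name), escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def post_xml(xml, url=UPDATE_URL):
    """POST one update message to Solr and return the HTTP status code."""
    req = Request(url, data=xml.encode("utf-8"),
                  headers={"Content-Type": "text/xml; charset=utf-8"})
    with urlopen(req) as resp:
        return resp.status


def chunks(rows, size):
    """Yield consecutive batches of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def reindex(rows, post=post_xml, batch_size=500, workers=4):
    """Send batches from several threads; returns one result per batch."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda batch: post(build_add_xml(batch)),
                             chunks(rows, batch_size)))
```

You would call it as reindex(rows) and then send a commit yourself, for example post_xml("<commit/>"). The `post` argument is there so you can swap in a different sender or a stub for a dry run.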