Recovering shards from down state

decenttp
I have a 3-server (Debian) ensemble running ZooKeeper and Solr 6.6.0 on AWS.
The setup has 3 shards per server with a replication factor of 3. It has
around 11 collections, 2 of which are large, with over 5 million records
each. Since I was maxing out the RAM, I tried launching the servers with a
higher configuration (64 GB RAM each). Everything went fine and I was able to
get it all started again. However, shards from those 2 larger collections
had an issue and came up in a down state.
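
To pin down exactly which shards have no leader and which replicas are not active, the Collections API CLUSTERSTATUS action can be queried and summarized. The following is only a minimal sketch under some assumptions: the base URL is a placeholder for one of the nodes, the helper names `fetch_cluster_status` and `summarize` are made up for illustration, and the response is assumed to have the usual `cluster -> collections -> shards -> replicas` layout with `leader` reported as the string "true". It does not cross-check `live_nodes`, so a replica on a dead node may still be listed as active.

```python
import json
from urllib.request import urlopen


def fetch_cluster_status(base_url):
    """Fetch CLUSTERSTATUS as a dict.

    base_url is a placeholder such as 'http://xx.xx.xx.xx:8983/solr'.
    """
    url = base_url.rstrip("/") + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize(status):
    """Return (leaderless_shards, inactive_replicas) for a CLUSTERSTATUS dict.

    A shard counts as leaderless here when no replica is both active and
    flagged as leader (an illustrative choice, not Solr's own definition).
    """
    leaderless = []
    inactive = []
    collections = status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            has_leader = False
            for replica in shard.get("replicas", {}).values():
                state = replica.get("state")
                if state != "active":
                    inactive.append((coll_name, shard_name, replica.get("core"), state))
                elif replica.get("leader") == "true":
                    has_leader = True
            if not has_leader:
                leaderless.append((coll_name, shard_name))
    return leaderless, inactive
```

Calling `summarize(fetch_cluster_status("http://xx.xx.xx.xx:8983/solr"))` against any live node should give a list of (collection, shard) pairs with no active leader and a list of non-active replicas, which can then be compared with the errors in the log.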

I am getting this in my logs:

Time (Local) Level Core Logger Message
11/10/2017, 5:16:59 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
11/10/2017, 5:16:59 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
11/10/2017, 5:16:59 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
11/10/2017, 5:17:00 PM
ERROR false
RecoveryStrategy
Error while trying to recover.
core=pub_match_shard3_replica3:org.apache.solr.common.SolrException: No
registered leader was found after waiting for 4000ms , collection:
pub_match slice: shard3
11/10/2017, 5:17:00 PM
ERROR false
RecoveryStrategy
Recovery failed - trying again... (11)
11/10/2017, 5:17:02 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard1
11/10/2017, 5:17:04 PM
ERROR false
SolrCmdDistributor
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://xx.xx.xx.xx:8983/solr/pub_match_shard1_replica1:
Expected mime type application/octet-stream but got text/html. <html>
11/10/2017, 5:17:04 PM
ERROR false
SolrCmdDistributor
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://xx.xx.xx.xx:8983/solr/pub_match_shard3_replica1:
Expected mime type application/octet-stream but got text/html. <html>
11/10/2017, 5:17:04 PM
ERROR false
SolrCmdDistributor
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://xx.xx.xx.xx:8983/solr/pub_match_shard1_replica1:
Expected mime type application/octet-stream but got text/html. <html>
11/10/2017, 5:17:04 PM
ERROR false
SolrCmdDistributor
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://xx.xx.xx.xx:8983/solr/pub_match_shard3_replica1:
Expected mime type application/octet-stream but got text/html. <html>
11/10/2017, 5:17:05 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
11/10/2017, 5:17:05 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
11/10/2017, 5:17:05 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
11/10/2017, 5:17:05 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
11/10/2017, 5:17:05 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard3
11/10/2017, 5:17:12 PM
WARN false
ReplicationHandler
Exception while writing response for params:
generation=191910&checksum=true&qt=/replication&file=_4vc8.si&wt=filestream&command=filecontent
11/10/2017, 5:17:12 PM
WARN false
ReplicationHandler
Exception while writing response for params:
generation=191910&checksum=true&qt=/replication&file=_4vc8.si&wt=filestream&command=filecontent
11/10/2017, 5:17:12 PM
WARN false
ReplicationHandler
Exception while writing response for params:
generation=191910&checksum=true&qt=/replication&file=_4vc8.si&wt=filestream&command=filecontent
11/10/2017, 5:17:12 PM
WARN false
ReplicationHandler
Exception while writing response for params:
generation=191910&checksum=true&qt=/replication&file=_4vc8.si&wt=filestream&command=filecontent
11/10/2017, 5:17:12 PM
WARN false
ReplicationHandler
Exception while writing response for params:
generation=191910&checksum=true&qt=/replication&file=_4vc8.si&wt=filestream&command=filecontent
11/10/2017, 5:17:12 PM
WARN false
ReplicationHandler
Exception while writing response for params:
generation=191910&checksum=true&qt=/replication&file=_4vc8.si&wt=filestream&command=filecontent
11/10/2017, 5:17:12 PM
WARN false
ReplicationHandler
Exception while writing response for params:
generation=191910&checksum=true&qt=/replication&file=_4vc8.si&wt=filestream&command=filecontent
11/10/2017, 5:17:13 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard1
11/10/2017, 5:17:16 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard1
11/10/2017, 5:17:19 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard3
11/10/2017, 5:17:22 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard1
11/10/2017, 5:17:23 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard3
11/10/2017, 5:17:23 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard1
11/10/2017, 5:17:24 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard3
11/10/2017, 5:17:25 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard3
11/10/2017, 5:17:28 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard3
11/10/2017, 5:17:30 PM
ERROR false
RequestHandlerBase
org.apache.solr.common.SolrException: No registered leader was found after
waiting for 4000ms , collection: pub_match slice: shard3
11/10/2017, 5:17:31 PM
ERROR false
SolrCmdDistributor
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://xx.xx.xx.xx:8983/solr/pub_match_shard3_replica1:
Expected mime type application/octet-stream but got text/html. <html>
11/10/2017, 5:17:31 PM
ERROR false
SolrCmdDistributor
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://xx.xx.xx.xx:8983/solr/pub_match_shard1_replica1:
Expected mime type application/octet-stream but got text/html. <html>
11/10/2017, 5:17:33 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
11/10/2017, 5:17:33 PM
WARN false
DistributedUpdateProcessor
Error sending update to http://xx.xx.xx.xx:8983/solr
Your help is highly appreciated.

Thanks

Row



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html