Solr 6.6.0 - Indexing errors

Joe Obernberger
We've been indexing data on a 45-node cluster with 100 shards and 3
replicas, but our indexing processes keep stopping because of errors.
On the server side, the error is "Error logging add". Stack trace:

2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
s:shard58 r:core_node290 x:UNCLASS_shard58_replica1]
o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard58_replica1]
webapp=/solr path=/update
params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[COLLECT20003218348784
(1573172872544780288), COLLECT20003218351447 (1573172872620277760),
COLLECT20003218353085 (1573172872625520640), COLLECT20003218357937
(1573172872627617792), COLLECT20003218361860 (1573172872629714944),
COLLECT20003218362535 (1573172872631812096)]} 0 171
2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
s:shard13 r:core_node81 x:UNCLASS_shard13_replica1]
o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard13_replica1]
webapp=/solr path=/update
params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[COLLECT20003218344436
(1573172872538488832), COLLECT20003218347497 (1573172872620277760),
COLLECT20003218351645 (1573172872625520640), COLLECT20003218356965
(1573172872629714944), COLLECT20003218357775 (1573172872632860672),
COLLECT20003218358017 (1573172872646492160), COLLECT20003218358152
(1573172872650686464), COLLECT20003218359395 (1573172872651735040),
COLLECT20003218362571 (1573172872652783616)]} 0 274
2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
s:shard43 r:core_node108 x:UNCLASS_shard43_replica1]
o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard43_replica1]
webapp=/solr path=/update
params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard43_replica2/&wt=javabin&version=2}{}
0 0
2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
s:shard43 r:core_node108 x:UNCLASS_shard43_replica1]
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error
logging add
         at
org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)
         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
         at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1113)
         at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
         at
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
         at
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
         at
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
         at
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
         at
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
         at
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
         at
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
         at
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
         at
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
         at
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
         at
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
         at
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
         at
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
         at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
         at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
         at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
         at
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
         at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
         at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
         at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
         at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
         at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
         at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
         at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
         at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
         at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
         at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
         at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
         at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
         at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
         at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
         at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
         at
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
         at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
         at org.eclipse.jetty.server.Server.handle(Server.java:534)
         at
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
         at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
         at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
         at
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
         at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
         at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
         at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
         at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
         at
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
         at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
could only be replicated to 0 nodes instead of minReplication (=1).  
There are 40 datanode(s) running and no node(s) are excluded in this
operation.
         at
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
         at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
         at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
         at
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
         at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
         at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
         at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:422)
         at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)

         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
         at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
         at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
         at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:498)
         at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
         at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
         at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
         at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
         at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)

2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
s:shard43 r:core_node108 x:UNCLASS_shard43_replica1]
o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error
logging add
         at
org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)
         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
         at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1113)
         at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
         at
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
         at
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
         at
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
         at
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
         at
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
         at
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
         at
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
         at
org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
         at
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
         at
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
         at
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
         at
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
         at
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
         at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
         at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
         at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
         at
org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
         at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
         at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
         at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
         at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
         at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
         at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
         at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
         at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
         at
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
         at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
         at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
         at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
         at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
         at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
         at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
         at
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
         at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
         at org.eclipse.jetty.server.Server.handle(Server.java:534)
         at
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
         at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
         at
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
         at
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
         at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
         at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
         at
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
         at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
         at
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
         at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
could only be replicated to 0 nodes instead of minReplication (=1).  
There are 40 datanode(s) running and no node(s) are excluded in this
operation.
         at
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
         at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
         at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
         at
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
         at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
         at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
         at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:422)
         at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)

         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
         at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
         at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
         at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:498)
         at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
         at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
         at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
         at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
         at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)

2017-07-17 12:29:24.187 INFO
(zkCallback-5-thread-144-processing-n:juliet:9100_solr) [   ]
o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
state:SyncConnected type:NodeDataChanged
path:/collections/UNCLASS/state.json] for collection [UNCLASS] has
occurred - updating... (live nodes size: [45])
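
When reading a trace like the one above, the entry that matters most is
usually the last "Caused by:" block (here, the HDFS RemoteException
about minReplication). The following is only a minimal sketch for
pulling those entries out of a hard-wrapped log; the class and method
names (CausedByScanner, causes, rootCause) are hypothetical and not part
of Solr or Hadoop, and it assumes a message ends at the first stack
frame line ("at ...") or blank line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: collects the "Caused by:" messages from a Java stack trace. */
final class CausedByScanner {
    private static final String MARKER = "Caused by:";

    /** Returns every "Caused by:" message in order of appearance. */
    static List<String> causes(String logText) {
        List<String> causes = new ArrayList<>();
        StringBuilder current = null;
        for (String raw : logText.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith(MARKER)) {
                // A new cause starts; flush the previous one if it is still open.
                if (current != null) {
                    causes.add(current.toString());
                }
                current = new StringBuilder(line.substring(MARKER.length()).trim());
            } else if (current != null) {
                // Assumption: the message ends at a blank line or a stack frame ("at ...").
                if (line.isEmpty() || line.equals("at") || line.startsWith("at ")) {
                    causes.add(current.toString());
                    current = null;
                } else {
                    // The mail client wrapped the message; join the pieces with one space.
                    if (current.length() > 0) {
                        current.append(' ');
                    }
                    current.append(line);
                }
            }
        }
        if (current != null) {
            causes.add(current.toString());
        }
        return causes;
    }

    /** Returns the last "Caused by:" message, or null if the log has none. */
    static String rootCause(String logText) {
        List<String> all = causes(logText);
        return all.isEmpty() ? null : all.get(all.size() - 1);
    }
}
```

Run against the server log above, this should surface the HDFS
"could only be replicated to 0 nodes" message rather than the outer
"Error logging add" SolrException.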

On the client side, the error looks like:
2017-07-16 19:03:16,118 WARN
[com.ngc.bigdata.ie_solrindexer.IndexDocument] Indexing error:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
writing document id COLLECT10086453202 to the index; possible analysis
error. for collection: UNCLASS
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
writing document id COLLECT10086453202 to the index; possible analysis
error.
         at
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:819)
         at
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1263)
         at
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1134)
         at
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1073)
         at
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85)
         at
com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(IndexDocument.java:959)
         at
com.ngc.bigdata.ie_solrindexer.IndexDocument.index(IndexDocument.java:236)
         at
com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(SolrIndexerProcessor.java:63)
         at
com.ngc.intelenterprise.intelentutil.utils.Processor.run(Processor.java:140)
         at
com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.process(IntelEntQueueProc.java:208)
         at
org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
         at
org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
         at
org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:460)
         at
org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
         at
org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
         at
org.apache.camel.component.seda.SedaConsumer.sendToConsumers(SedaConsumer.java:298)
         at
org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsumer.java:207)
         at
org.apache.camel.component.seda.SedaConsumer.run(SedaConsumer.java:154)
         at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
         at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
         at java.lang.Thread.run(Thread.java:748)
Caused by:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
Exception writing document id COLLECT10086453202 to the index; possible
analysis error.
         at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:610)
         at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
         at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
         at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
         at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
         at
org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:796)
         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
         at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
         ... 3 more
2017-07-16 19:03:16,134 ERROR
[com.ngc.bigdata.ie_solrindexer.IndexDocument] Error indexing:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
writing document id COLLECT10086453202 to the index; possible analysis
error. for collection: UNCLASS.
2017-07-16 19:03:16,135 ERROR
[com.ngc.bigdata.ie_solrindexer.IndexDocument] Exception during
indexing:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
writing document id COLLECT10086453202 to the index; possible analysis
error.

I can restart the indexing processes, but they only run for a short
time before hitting more indexing errors.  Several of the nodes show as
down in the cloud view.  Any help would be appreciated!  Thank you!


-Joe

Re: Solr 6.6.0 - Indexing errors

Joe Obernberger
Some more info:

When I stop all the indexers, the cluster goes all green within about
5-10 minutes.  When I start just one indexer, several nodes immediately
go down with the "Error logging add" message.

I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
indexing.  Is this correct for SolrCloud?
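
For context, the call pattern described above (handing a list of
documents to the client in one request) can be sketched roughly as
below. This is an illustration only, not the actual indexer code: the
class BatchSender, the method sendInBatches, and the sink parameter are
hypothetical, and the sink stands in for a call such as
CloudSolrClient.add(batch) so that the sketch has no SolrJ dependency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: sends documents to a sink in fixed-size batches. */
final class BatchSender {

    /**
     * Splits docs into batches of at most batchSize and hands each batch
     * to sink. Returns the number of batches sent.
     */
    static <T> int sendInBatches(List<T> docs, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            // Copy the slice so the sink does not hold a view onto the caller's list.
            sink.accept(new ArrayList<>(docs.subList(start, end)));
            batches++;
        }
        return batches;
    }
}
```

In a real indexer the sink would wrap the SolrJ call; since
SolrClient.add throws the checked exceptions SolrServerException and
IOException, the lambda passed as sink would have to catch or rethrow
them. The batch size is a tuning choice and is not taken from this
thread.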

Thank you!

-Joe


On 7/17/2017 8:36 AM, Joe Obernberger wrote:

> We've been indexing data on a 45 node cluster with 100 shards and 3
> replicas, but our indexing processes have been stopping due to
> errors.  On the server side the error is "Error logging add". Stack
> trace:
>
> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
> s:shard58 r:core_node290 x:UNCLASS_shard58_replica1]
> o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard58_replica1]
> webapp=/solr path=/update
> params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[COLLECT20003218348784
> (1573172872544780288), COLLECT20003218351447 (1573172872620277760),
> COLLECT20003218353085 (1573172872625520640), COLLECT20003218357937
> (1573172872627617792), COLLECT20003218361860 (1573172872629714944),
> COLLECT20003218362535 (1573172872631812096)]} 0 171
> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
> s:shard13 r:core_node81 x:UNCLASS_shard13_replica1]
> o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard13_replica1]
> webapp=/solr path=/update
> params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[COLLECT20003218344436
> (1573172872538488832), COLLECT20003218347497 (1573172872620277760),
> COLLECT20003218351645 (1573172872625520640), COLLECT20003218356965
> (1573172872629714944), COLLECT20003218357775 (1573172872632860672),
> COLLECT20003218358017 (1573172872646492160), COLLECT20003218358152
> (1573172872650686464), COLLECT20003218359395 (1573172872651735040),
> COLLECT20003218362571 (1573172872652783616)]} 0 274
> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
> s:shard43 r:core_node108 x:UNCLASS_shard43_replica1]
> o.a.s.u.p.LogUpdateProcessorFactory [UNCLASS_shard43_replica1]
> webapp=/solr path=/update
> params={update.distrib=FROMLEADER&update.chain=add-unknown-fields-to-the-schema&distrib.from=http://tarvos:9100/solr/UNCLASS_shard43_replica2/&wt=javabin&version=2}{}
> 0 0
> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
> s:shard43 r:core_node108 x:UNCLASS_shard43_replica1]
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error
> logging add
>         at
> org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)
>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1113)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
>         at
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>         at
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
>         at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>         at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>         at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
>         at
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>         at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>         at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
>         at
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>         at
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
>         at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>         at
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
>         at
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>         at
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>         at
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>         at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>         at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
>         at
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>         at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>         at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>         at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>         at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>         at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>         at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>         at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>         at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>         at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
> File
> /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
> could only be replicated to 0 nodes instead of minReplication (=1).  
> There are 40 datanode(s) running and no node(s) are excluded in this
> operation.
>         at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
>         at
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
>
> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
> s:shard43 r:core_node108 x:UNCLASS_shard43_replica1]
> o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error
> logging add
>         at
> org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)
>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1113)
>         at
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
>         at
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>         at
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
>         at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>         at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>         at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
>         at
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>         at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>         at
> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
>         at
> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>         at
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
>         at
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>         at
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
>         at
> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>         at
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>         at
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>         at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>         at
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
>         at
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>         at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
>         at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
>         at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
>         at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
>         at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
>         at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>         at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>         at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>         at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
>         at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>         at
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>         at
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
> File
> /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
> could only be replicated to 0 nodes instead of minReplication (=1).  
> There are 40 datanode(s) running and no node(s) are excluded in this
> operation.
>         at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1622)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3351)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
>         at
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1459)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1255)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
>
> 2017-07-17 12:29:24.187 INFO
> (zkCallback-5-thread-144-processing-n:juliet:9100_solr) [   ]
> o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
> state:SyncConnected type:NodeDataChanged
> path:/collections/UNCLASS/state.json] for collection [UNCLASS] has
> occurred - updating... (live nodes size: [45])
>
> On the client side, the error looks like:
> 2017-07-16 19:03:16,118 WARN
> [com.ngc.bigdata.ie_solrindexer.IndexDocument] Indexing error:
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException:
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index;
> possible analysis error. for collection: UNCLASS
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException:
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index;
> possible analysis error.
>         at
> org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:819)
>         at
> org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1263)
>         at
> org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1134)
>         at
> org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1073)
>         at
> org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
>         at
> org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
>         at
> org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
>         at
> org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85)
>         at
> com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(IndexDocument.java:959)
>         at
> com.ngc.bigdata.ie_solrindexer.IndexDocument.index(IndexDocument.java:236)
>         at
> com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(SolrIndexerProcessor.java:63)
>         at
> com.ngc.intelenterprise.intelentutil.utils.Processor.run(Processor.java:140)
>         at
> com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.process(IntelEntQueueProc.java:208)
>         at
> org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
>         at
> org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
>         at
> org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:460)
>         at
> org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>         at
> org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>         at
> org.apache.camel.component.seda.SedaConsumer.sendToConsumers(SedaConsumer.java:298)
>         at
> org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsumer.java:207)
>         at
> org.apache.camel.component.seda.SedaConsumer.run(SedaConsumer.java:154)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index;
> possible analysis error.
>         at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:610)
>         at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
>         at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
>         at
> org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
>         at
> org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
>         at
> org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:796)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>         ... 3 more
> 2017-07-16 19:03:16,134 ERROR
> [com.ngc.bigdata.ie_solrindexer.IndexDocument] Error indexing:
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException:
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index;
> possible analysis error. for collection: UNCLASS.
> 2017-07-16 19:03:16,135 ERROR
> [com.ngc.bigdata.ie_solrindexer.IndexDocument] Exception during
> indexing:
> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException:
> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: 
> Exception writing document id COLLECT10086453202 to the index;
> possible analysis error.
>
> I can fire them back up, but they only run for a short time before
> getting more indexing errors.  Several of the nodes show as down in
> the cloud view.  Any help would be appreciated!  Thank you!
>
>
> -Joe
>

Re: Solr 6.6.0 - Indexing errors

Shawn Heisey-2
In reply to this post by Joe Obernberger
On 7/17/2017 6:36 AM, Joe Obernberger wrote:
> We've been indexing data on a 45 node cluster with 100 shards and 3
> replicas, but our indexing processes have been stopping due to
> errors.  On the server side the error is "Error logging add". Stack
> trace:
 <snip>
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
> File
> /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
> could only be replicated to 0 nodes instead of minReplication (=1).
> There are 40 datanode(s) running and no node(s) are excluded in this
> operation.

The excerpt from your log that I preserved above shows that the root of
the problem is something going wrong when Solr writes to HDFS.  I can
only tell that there was a problem; I do not know what actually went wrong.
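
For anyone reading similar traces: the exception Solr reports ("Error
logging add") is only the outermost wrapper, and the HDFS message sits
in the "Caused by" section beneath it.  A minimal Java sketch of walking
a cause chain down to its innermost exception is below.  The class name
RootCause and the stand-in exceptions in main are hypothetical, made up
for illustration; they are not part of Solr or Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Solr or Hadoop.
final class RootCause {
    private RootCause() {}

    /** Follows getCause() links to the innermost exception, stopping if the chain loops. */
    static Throwable find(Throwable t) {
        List<Throwable> seen = new ArrayList<>();
        Throwable current = t;
        while (current.getCause() != null && !seen.contains(current.getCause())) {
            seen.add(current);
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins for the SolrException / RemoteException pair in the log.
        Throwable hdfs = new java.io.IOException(
                "could only be replicated to 0 nodes instead of minReplication (=1)");
        Throwable solr = new RuntimeException("Error logging add", hdfs);
        System.out.println(find(solr).getMessage());
    }
}
```

Logging the innermost message next to the outer one makes it easier to
see at a glance which layer actually failed.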

I think you'll need to take this information to the hadoop project and
ask them what could cause it and what can be done about it.

Solr includes Hadoop 2.7.2 jars.  This is not the latest version of
Hadoop, so it's possible there might be a known issue with this version
that is fixed in a later version.  There is a task to update Solr's
Hadoop to 3.0 when it gets released:

https://issues.apache.org/jira/browse/SOLR-9515

Thanks,
Shawn

Re: Solr 6.6.0 - Indexing errors

Susheel Kumar-3
In reply to this post by Joe Obernberger
The client-side log also reports a possible analysis error.  I would suggest
testing the indexer on a single-shard setup first, then with a replica
(1 shard and 1 replica), and then with 2 shards and 2 replicas.  This would
confirm whether there is a basic issue with the indexing / cluster setup.
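
The staged test above could be set up with the Collections API CREATE
action.  The Java sketch below only builds the request URLs and sends
nothing.  The host localhost:9100, the collection names, and the config
set name "myconf" are placeholders, and treating "1 shard and 1 replica"
as replicationFactor=2 is my reading of the suggestion, not something
confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it only builds URL strings.
final class StagedCollections {
    private StagedCollections() {}

    /** Builds a Collections API CREATE URL for the given layout (no URL-encoding; simple names assumed). */
    static String createUrl(String solrBase, String name, String configName,
                            int numShards, int replicationFactor) {
        return solrBase + "/admin/collections?action=CREATE"
                + "&name=" + name
                + "&numShards=" + numShards
                + "&replicationFactor=" + replicationFactor
                + "&collection.configName=" + configName;
    }

    /** The three test layouts, smallest first. */
    static List<String> plan(String solrBase, String configName) {
        List<String> urls = new ArrayList<>();
        urls.add(createUrl(solrBase, "test_1x1", configName, 1, 1)); // one shard, single copy
        urls.add(createUrl(solrBase, "test_1x2", configName, 1, 2)); // one shard, two copies
        urls.add(createUrl(solrBase, "test_2x2", configName, 2, 2)); // two shards, two copies each
        return urls;
    }

    public static void main(String[] args) {
        for (String url : plan("http://localhost:9100/solr", "myconf")) {
            System.out.println(url);
        }
    }
}
```

Running the same indexer against each layout in turn should show at
which step the errors first appear.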

On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <[hidden email]> wrote:

> Some more info:
>
> When I stop all the indexers, in about 5-10 minutes the cluster goes all
> green.  When I start just one indexer, several nodes immediately go down
> with the 'Error logging add' message.
>
> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
> indexing.  Is this correct for SolrCloud?
>
> Thank you!
>
> -Joe
>
>
>
> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>
>> We've been indexing data on a 45 node cluster with 100 shards and 3
>> replicas, but our indexing processes have been stopping due to errors.  On
>> the server side the error is "Error logging add". Stack trace:
>>
>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS s:shard58
>> r:core_node290 x:UNCLASS_shard58_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0 171
>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS s:shard13
>> r:core_node81 x:UNCLASS_shard13_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0 274
>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS s:shard43
>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43
>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>> org.apache.solr.common.SolrException: Error logging add
>>         at org.apache.solr.update.TransactionLog.write(TransactionLog.
>> java:418)
>>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>         at org.apache.solr.update.processor.DistributedUpdateProcessor.
>> versionAdd(DistributedUpdateProcessor.java:1113)
>>         at org.apache.solr.update.processor.DistributedUpdateProcessor.
>> processAdd(DistributedUpdateProcessor.java:748)
>>         at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>         at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>> nLoader.java:98)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>         at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>> odec.java:306)
>>         at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>> c.java:251)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>         at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>> odec.java:271)
>>         at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>> c.java:251)
>>         at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>> dec.java:173)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>         at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>> s(JavabinLoader.java:108)
>>         at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>> der.java:55)
>>         at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>> questHandler.java:97)
>>         at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>> stBody(ContentStreamHandlerBase.java:68)
>>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>> uestHandlerBase.java:173)
>>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>         at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>> java:723)
>>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>> 529)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>> atchFilter.java:361)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>> atchFilter.java:305)
>>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>> r(ServletHandler.java:1691)
>>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>> dler.java:582)
>>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>> Handler.java:143)
>>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>> ndler.java:548)
>>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>> SessionHandler.java:226)
>>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>> ContextHandler.java:1180)
>>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>> ler.java:512)
>>         at org.eclipse.jetty.server.session.SessionHandler.doScope(
>> SessionHandler.java:185)
>>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>> ContextHandler.java:1112)
>>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>> Handler.java:141)
>>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>> ndle(ContextHandlerCollection.java:213)
>>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>> HandlerCollection.java:119)
>>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>> erWrapper.java:134)
>>         at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>> iteHandler.java:335)
>>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>> erWrapper.java:134)
>>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>> java:320)
>>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>> ction.java:251)
>>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>> succeeded(AbstractConnection.java:273)
>>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>> java:95)
>>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>> elEndPoint.java:93)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .produceConsume(ExecuteProduceConsume.java:148)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .run(ExecuteProduceConsume.java:136)
>>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>> ThreadPool.java:671)
>>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>> hreadPool.java:589)
>>         at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>> could only be replicated to 0 nodes instead of minReplication (=1).  There
>> are 40 datanode(s) running and no node(s) are excluded in this operation.
>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>> hooseTarget4NewBlock(BlockManager.java:1622)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>> ionalBlock(FSNamesystem.java:3351)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>> Block(NameNodeRpcServer.java:683)
>>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>> tProtocol.java:214)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>> TranslatorPB.java:495)
>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>> enodeProtocolProtos.java)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>> voker.call(ProtobufRpcEngine.java:617)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:422)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>> upInformation.java:1920)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>> ProtobufRpcEngine.java:229)
>>         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>> thodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:498)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>> od(RetryInvocationHandler.java:191)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>> ryInvocationHandler.java:102)
>>         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>> llowingBlock(DFSOutputStream.java:1459)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>> kOutputStream(DFSOutputStream.java:1255)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>> DFSOutputStream.java:449)
>>
>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43
>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>> org.apache.solr.common.SolrException: Error logging add
>>         at org.apache.solr.update.TransactionLog.write(TransactionLog.
>> java:418)
>>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>         at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>         at org.apache.solr.update.processor.DistributedUpdateProcessor.
>> versionAdd(DistributedUpdateProcessor.java:1113)
>>         at org.apache.solr.update.processor.DistributedUpdateProcessor.
>> processAdd(DistributedUpdateProcessor.java:748)
>>         at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>         at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>> nLoader.java:98)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>         at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>> odec.java:306)
>>         at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>> c.java:251)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>         at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>> odec.java:271)
>>         at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>> c.java:251)
>>         at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>> dec.java:173)
>>         at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>         at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>> s(JavabinLoader.java:108)
>>         at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>> der.java:55)
>>         at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>> questHandler.java:97)
>>         at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>> stBody(ContentStreamHandlerBase.java:68)
>>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>> uestHandlerBase.java:173)
>>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>         at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>> java:723)
>>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>> 529)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>> atchFilter.java:361)
>>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>> atchFilter.java:305)
>>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>> r(ServletHandler.java:1691)
>>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>> dler.java:582)
>>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>> Handler.java:143)
>>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>> ndler.java:548)
>>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>> SessionHandler.java:226)
>>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>> ContextHandler.java:1180)
>>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>> ler.java:512)
>>         at org.eclipse.jetty.server.session.SessionHandler.doScope(
>> SessionHandler.java:185)
>>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>> ContextHandler.java:1112)
>>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>> Handler.java:141)
>>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>> ndle(ContextHandlerCollection.java:213)
>>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>> HandlerCollection.java:119)
>>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>> erWrapper.java:134)
>>         at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>> iteHandler.java:335)
>>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>> erWrapper.java:134)
>>         at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>> java:320)
>>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>> ction.java:251)
>>         at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>> succeeded(AbstractConnection.java:273)
>>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>> java:95)
>>         at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>> elEndPoint.java:93)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .produceConsume(ExecuteProduceConsume.java:148)
>>         at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>> .run(ExecuteProduceConsume.java:136)
>>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>> ThreadPool.java:671)
>>         at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>> hreadPool.java:589)
>>         at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>> could only be replicated to 0 nodes instead of minReplication (=1).  There
>> are 40 datanode(s) running and no node(s) are excluded in this operation.
>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>> hooseTarget4NewBlock(BlockManager.java:1622)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>> ionalBlock(FSNamesystem.java:3351)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>> Block(NameNodeRpcServer.java:683)
>>         at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>> tProtocol.java:214)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>> TranslatorPB.java:495)
>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>> enodeProtocolProtos.java)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>> voker.call(ProtobufRpcEngine.java:617)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:422)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>> upInformation.java:1920)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>> ProtobufRpcEngine.java:229)
>>         at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>         at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>> thodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:498)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>> od(RetryInvocationHandler.java:191)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>> ryInvocationHandler.java:102)
>>         at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>> llowingBlock(DFSOutputStream.java:1459)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>> kOutputStream(DFSOutputStream.java:1255)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>> DFSOutputStream.java:449)
>>
>> 2017-07-17 12:29:24.187 INFO (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>> state:SyncConnected type:NodeDataChanged path:/collections/UNCLASS/state.json]
>> for collection [UNCLASS] has occurred - updating... (live nodes size: [45])
>>
>> On the client side, the error looks like:
>> 2017-07-16 19:03:16,118 WARN [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>> Indexing error: org.apache.solr.client.solrj.i
>> mpl.CloudSolrClient$RouteException: Error from server at
>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>> document id COLLECT10086453202 to the index; possible analysis error. for
>> collection: UNCLASS
>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>> writing document id COLLECT10086453202 to the index; possible analysis
>> error.
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpda
>> te(CloudSolrClient.java:819)
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.sendReques
>> t(CloudSolrClient.java:1263)
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWit
>> hRetryOnStaleState(CloudSolrClient.java:1134)
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.request(Cl
>> oudSolrClient.java:1073)
>>         at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest
>> .java:160)
>>         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>> 106)
>>         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>> 71)
>>         at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>> 85)
>>         at com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(
>> IndexDocument.java:959)
>>         at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(
>> IndexDocument.java:236)
>>         at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(S
>> olrIndexerProcessor.java:63)
>>         at com.ngc.intelenterprise.intelentutil.utils.Processor.run(
>> Processor.java:140)
>>         at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.
>> process(IntelEntQueueProc.java:208)
>>         at org.apache.camel.processor.DelegateSyncProcessor.process(Del
>> egateSyncProcessor.java:63)
>>         at org.apache.camel.management.InstrumentationProcessor.process
>> (InstrumentationProcessor.java:77)
>>         at org.apache.camel.processor.RedeliveryErrorHandler.process(Re
>> deliveryErrorHandler.java:460)
>>         at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>> melInternalProcessor.java:190)
>>         at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>> melInternalProcessor.java:190)
>>         at org.apache.camel.component.seda.SedaConsumer.sendToConsumers
>> (SedaConsumer.java:298)
>>         at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsu
>> mer.java:207)
>>         at org.apache.camel.component.seda.SedaConsumer.run(SedaConsume
>> r.java:154)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>> Executor.java:1142)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>> lExecutor.java:617)
>>         at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>> Exception writing document id COLLECT10086453202 to the index; possible
>> analysis error.
>>         at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMeth
>> od(HttpSolrClient.java:610)
>>         at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>> pSolrClient.java:279)
>>         at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>> pSolrClient.java:268)
>>         at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest
>> (LBHttpSolrClient.java:447)
>>         at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(L
>> BHttpSolrClient.java:388)
>>         at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$dir
>> ectUpdate$0(CloudSolrClient.java:796)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>         at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>> xecutor.lambda$execute$0(ExecutorUtil.java:229)
>>         ... 3 more
>> 2017-07-16 19:03:16,134 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>> Error indexing: org.apache.solr.client.solrj.i
>> mpl.CloudSolrClient$RouteException: Error from server at
>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>> document id COLLECT10086453202 to the index; possible analysis error. for
>> collection: UNCLASS.
>> 2017-07-16 19:03:16,135 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>> Exception during indexing: org.apache.solr.client.solrj.i
>> mpl.CloudSolrClient$RouteException: Error from server at
>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>> document id COLLECT10086453202 to the index; possible analysis error.
>>
>> I can fire them back up, but they only run for a short time before
>> getting more indexing errors.  Several of the nodes show as down in the
>> cloud view.  Any help would be appreciated!  Thank you!
>>
>>
>> -Joe
>>
>>
>

Re: Solr 6.6.0 - Indexing errors

Joe Obernberger
So far we've indexed about 46 million documents, but over the weekend
these errors started coming up.  If there were a basic issue, I would
expect the errors to have started right away.  We ran a test cluster
with just a few shards/replicas beforehand and didn't see any issues using
the same indexing code, but we're running a lot more indexers
simultaneously with the larger cluster; perhaps we're just overloading
HDFS?  The same nodes that run Solr also run HDFS datanodes, but they
are pretty beefy machines; we're not swapping.
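
If the indexers are in fact overloading HDFS, one thing that could be tried on the client side is retrying a failed batch with an increasing delay instead of stopping on the first error. The following is only a sketch under that assumption: BatchSender and sendWithBackoff are hypothetical names, not part of SolrJ, and the attempt count and delays are made up. In a real indexer the sender would wrap a call such as CloudSolrClient.add(List<SolrInputDocument>).

```java
import java.util.Arrays;
import java.util.List;

public class BackoffIndexer {

    /** Hypothetical stand-in for a call such as CloudSolrClient.add(List<SolrInputDocument>). */
    public interface BatchSender<T> {
        void send(List<T> batch) throws Exception;
    }

    /**
     * Sends one batch, doubling the wait after each failed attempt.
     * Returns the number of attempts used; rethrows the last exception
     * once maxAttempts is exhausted.
     */
    public static <T> int sendWithBackoff(BatchSender<T> sender, List<T> batch,
                                          int maxAttempts, long initialDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                sender.send(batch);
                return attempt;
            } catch (Exception e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and let the caller decide what to do
                }
                Thread.sleep(delay); // back off before the next attempt
                delay *= 2;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated sender that fails twice before succeeding.
        final int[] calls = {0};
        BatchSender<String> flaky = batch -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("simulated transient failure");
            }
        };
        int attempts = sendWithBackoff(flaky, Arrays.asList("doc1", "doc2"), 5, 10L);
        System.out.println("succeeded after " + attempts + " attempt(s)");
    }
}
```

Backing off only helps if the failures are transient; if the HDFS logs show a persistent problem (for example datanodes refusing new blocks), retries would just delay the same error.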

As Shawn pointed out, I will be checking the HDFS version (we're using
Cloudera CDH 5.10.2), and the HDFS logs.
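
When going through the logs, the line worth matching against the HDFS side is the innermost "Caused by:" entry of each trace, since the outer Solr exception ("Error logging add") only wraps it. A small helper along these lines could pull it out; RootCauseFinder and innermostCause are hypothetical names, and the sample lines are the ones from the trace quoted in this thread with the mail line-wrapping joined back together.

```java
import java.util.Arrays;
import java.util.List;

public class RootCauseFinder {

    private static final String PREFIX = "Caused by:";

    /** Returns the text of the last "Caused by:" line in a trace, or null if there is none. */
    public static String innermostCause(List<String> traceLines) {
        String last = null;
        for (String line : traceLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith(PREFIX)) {
                // Keep overwriting so the deepest cause wins.
                last = trimmed.substring(PREFIX.length()).trim();
            }
        }
        return last;
    }

    public static void main(String[] args) {
        List<String> trace = Arrays.asList(
            "org.apache.solr.common.SolrException: Error logging add",
            "        at org.apache.solr.update.TransactionLog.write(TransactionLog.java:418)",
            "Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): "
                + "File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211 "
                + "could only be replicated to 0 nodes instead of minReplication (=1).",
            "        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager"
                + ".chooseTarget4NewBlock(BlockManager.java:1622)");
        System.out.println(innermostCause(trace));
    }
}
```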

-Joe


On 7/17/2017 10:16 AM, Susheel Kumar wrote:

> There is also an analysis error.  I would suggest testing the indexer on
> a single-shard setup first, then with a replica (1 shard and 1 replica),
> and then with 2 shards and 2 replicas.  This would confirm whether there
> is a basic issue with the indexing / cluster setup.
>
> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
> [hidden email]> wrote:
>
>> Some more info:
>>
>> When I stop all the indexers, in about 5-10 minutes the cluster goes all
>> green.  When I start just one indexer, several nodes immediately go down
>> with the 'Error logging add' message.
>>
>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
>> indexing.  Is this correct for SolrCloud?
>>
>> Thank you!
>>
>> -Joe
>>
>>
>>
>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>>
>>> We've been indexing data on a 45 node cluster with 100 shards and 3
>>> replicas, but our indexing processes have been stopping due to errors.  On
>>> the server side the error is "Error logging add". Stack trace:
>>>
>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS s:shard58
>>> r:core_node290 x:UNCLASS_shard58_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0 171
>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS s:shard13
>>> r:core_node81 x:UNCLASS_shard13_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0 274
>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS s:shard43
>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.u.p.LogUpdateProcessorFactory
>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43
>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>>> org.apache.solr.common.SolrException: Error logging add
>>>          at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>> java:418)
>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>> processAdd(DistributedUpdateProcessor.java:748)
>>>          at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>> nLoader.java:98)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>> odec.java:306)
>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>> c.java:251)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>> odec.java:271)
>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>> c.java:251)
>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>> dec.java:173)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>> s(JavabinLoader.java:108)
>>>          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>> der.java:55)
>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>> questHandler.java:97)
>>>          at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>> stBody(ContentStreamHandlerBase.java:68)
>>>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>> uestHandlerBase.java:173)
>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>          at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>> java:723)
>>>          at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>> 529)
>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>> atchFilter.java:361)
>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>> atchFilter.java:305)
>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>> r(ServletHandler.java:1691)
>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>> dler.java:582)
>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>> Handler.java:143)
>>>          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>> ndler.java:548)
>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>> SessionHandler.java:226)
>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>> ContextHandler.java:1180)
>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>> ler.java:512)
>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>> SessionHandler.java:185)
>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>> ContextHandler.java:1112)
>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>> Handler.java:141)
>>>          at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>> ndle(ContextHandlerCollection.java:213)
>>>          at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>> HandlerCollection.java:119)
>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>> erWrapper.java:134)
>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>> iteHandler.java:335)
>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>> erWrapper.java:134)
>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>> java:320)
>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>> ction.java:251)
>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>> succeeded(AbstractConnection.java:273)
>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>> java:95)
>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>> elEndPoint.java:93)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .run(ExecuteProduceConsume.java:136)
>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>> ThreadPool.java:671)
>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>> hreadPool.java:589)
>>>          at java.lang.Thread.run(Thread.java:748)
>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>> could only be replicated to 0 nodes instead of minReplication (=1).  There
>>> are 40 datanode(s) running and no node(s) are excluded in this operation.
>>>          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>> ionalBlock(FSNamesystem.java:3351)
>>>          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>> Block(NameNodeRpcServer.java:683)
>>>          at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>> tProtocol.java:214)
>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>> TranslatorPB.java:495)
>>>          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>> enodeProtocolProtos.java)
>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>> voker.call(ProtobufRpcEngine.java:617)
>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>> upInformation.java:1920)
>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>> ProtobufRpcEngine.java:229)
>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>> od(RetryInvocationHandler.java:191)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>> ryInvocationHandler.java:102)
>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>> llowingBlock(DFSOutputStream.java:1459)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>> kOutputStream(DFSOutputStream.java:1255)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>> DFSOutputStream.java:449)
>>>
>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS s:shard43
>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>>> org.apache.solr.common.SolrException: Error logging add
>>>          at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>> java:418)
>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>> processAdd(DistributedUpdateProcessor.java:748)
>>>          at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>> nLoader.java:98)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>> odec.java:306)
>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>> c.java:251)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>> odec.java:271)
>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>> c.java:251)
>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>> dec.java:173)
>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>> s(JavabinLoader.java:108)
>>>          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>> der.java:55)
>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>> questHandler.java:97)
>>>          at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>> stBody(ContentStreamHandlerBase.java:68)
>>>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>> uestHandlerBase.java:173)
>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>          at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>> java:723)
>>>          at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>> 529)
>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>> atchFilter.java:361)
>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>> atchFilter.java:305)
>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>> r(ServletHandler.java:1691)
>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>> dler.java:582)
>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>> Handler.java:143)
>>>          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>> ndler.java:548)
>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>> SessionHandler.java:226)
>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>> ContextHandler.java:1180)
>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>> ler.java:512)
>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>> SessionHandler.java:185)
>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>> ContextHandler.java:1112)
>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>> Handler.java:141)
>>>          at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>> ndle(ContextHandlerCollection.java:213)
>>>          at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>> HandlerCollection.java:119)
>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>> erWrapper.java:134)
>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>> iteHandler.java:335)
>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>> erWrapper.java:134)
>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>> java:320)
>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>> ction.java:251)
>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>> succeeded(AbstractConnection.java:273)
>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>> java:95)
>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>> elEndPoint.java:93)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>> .run(ExecuteProduceConsume.java:136)
>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>> ThreadPool.java:671)
>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>> hreadPool.java:589)
>>>          at java.lang.Thread.run(Thread.java:748)
>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>> could only be replicated to 0 nodes instead of minReplication (=1).  There
>>> are 40 datanode(s) running and no node(s) are excluded in this operation.
>>>          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>> ionalBlock(FSNamesystem.java:3351)
>>>          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>> Block(NameNodeRpcServer.java:683)
>>>          at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>> tProtocol.java:214)
>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>> TranslatorPB.java:495)
>>>          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>> enodeProtocolProtos.java)
>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>> voker.call(ProtobufRpcEngine.java:617)
>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>> upInformation.java:1920)
>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>> ProtobufRpcEngine.java:229)
>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>> od(RetryInvocationHandler.java:191)
>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>> ryInvocationHandler.java:102)
>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>> llowingBlock(DFSOutputStream.java:1459)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>> kOutputStream(DFSOutputStream.java:1255)
>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>> DFSOutputStream.java:449)
>>>
>>> 2017-07-17 12:29:24.187 INFO (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>>> state:SyncConnected type:NodeDataChanged path:/collections/UNCLASS/state.json]
>>> for collection [UNCLASS] has occurred - updating... (live nodes size: [45])
>>>
>>> On the client side, the error looks like:
>>> 2017-07-16 19:03:16,118 WARN [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>> Indexing error: org.apache.solr.client.solrj.i
>>> mpl.CloudSolrClient$RouteException: Error from server at
>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>> document id COLLECT10086453202 to the index; possible analysis error. for
>>> collection: UNCLASS
>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>> writing document id COLLECT10086453202 to the index; possible analysis
>>> error.
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpda
>>> te(CloudSolrClient.java:819)
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.sendReques
>>> t(CloudSolrClient.java:1263)
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWit
>>> hRetryOnStaleState(CloudSolrClient.java:1134)
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.request(Cl
>>> oudSolrClient.java:1073)
>>>          at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest
>>> .java:160)
>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>> 106)
>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>> 71)
>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>> 85)
>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(
>>> IndexDocument.java:959)
>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(
>>> IndexDocument.java:236)
>>>          at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(S
>>> olrIndexerProcessor.java:63)
>>>          at com.ngc.intelenterprise.intelentutil.utils.Processor.run(
>>> Processor.java:140)
>>>          at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.
>>> process(IntelEntQueueProc.java:208)
>>>          at org.apache.camel.processor.DelegateSyncProcessor.process(Del
>>> egateSyncProcessor.java:63)
>>>          at org.apache.camel.management.InstrumentationProcessor.process
>>> (InstrumentationProcessor.java:77)
>>>          at org.apache.camel.processor.RedeliveryErrorHandler.process(Re
>>> deliveryErrorHandler.java:460)
>>>          at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>> melInternalProcessor.java:190)
>>>          at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>> melInternalProcessor.java:190)
>>>          at org.apache.camel.component.seda.SedaConsumer.sendToConsumers
>>> (SedaConsumer.java:298)
>>>          at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsu
>>> mer.java:207)
>>>          at org.apache.camel.component.seda.SedaConsumer.run(SedaConsume
>>> r.java:154)
>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>>> Executor.java:1142)
>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>>> lExecutor.java:617)
>>>          at java.lang.Thread.run(Thread.java:748)
>>> Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>>> Exception writing document id COLLECT10086453202 to the index; possible
>>> analysis error.
>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMeth
>>> od(HttpSolrClient.java:610)
>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>> pSolrClient.java:279)
>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>> pSolrClient.java:268)
>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest
>>> (LBHttpSolrClient.java:447)
>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(L
>>> BHttpSolrClient.java:388)
>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$dir
>>> ectUpdate$0(CloudSolrClient.java:796)
>>>          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>          at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>>> xecutor.lambda$execute$0(ExecutorUtil.java:229)
>>>          ... 3 more
>>> 2017-07-16 19:03:16,134 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>> Error indexing: org.apache.solr.client.solrj.i
>>> mpl.CloudSolrClient$RouteException: Error from server at
>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>> document id COLLECT10086453202 to the index; possible analysis error. for
>>> collection: UNCLASS.
>>> 2017-07-16 19:03:16,135 ERROR [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>> Exception during indexing: org.apache.solr.client.solrj.i
>>> mpl.CloudSolrClient$RouteException: Error from server at
>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>
>>> I can fire them back up, but they only run for a short time before
>>> getting more indexing errors.  Several of the nodes show as down in the
>>> cloud view.  Any help would be appreciated!  Thank you!
>>>
>>>
>>> -Joe
>>>
>>>


Re: Solr 6.6.0 - Indexing errors

Erick Erickson
Joe:

I agree that 46 million docs later you'd expect things to have settled
out. However, I do note that you have
"add-unknown-fields-to-the-schema" in your error stack which means
you're using "field guessing", sometimes called data_driven. I would
recommend you do _not_ use this for production: while it does the
best job it can, it has to make assumptions about what the data looks
like based on the first document it sees, and those assumptions may
later be violated. "possible analysis error" is one of the messages
you get when this occurs.

The simple example: if the first value data_driven sees for a field
is "1", it'll guess integer. If sometime later there's a doc with
"1.0" in that field, it'll generate a parse error.
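
To make the "1" versus "1.0" case concrete, here is a rough Java sketch.
It is only an illustration under simplifying assumptions, not Solr's
actual field-guessing code: the class FieldGuessDemo, its methods, and
the field name "price" are all made up for the example. The type is
fixed by the first value seen, and later values must parse as that type.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldGuessDemo {
    // Hypothetical stand-in for a schema: field name -> guessed type.
    private final Map<String, String> schema = new HashMap<>();

    // Guess a type from one raw value, trying the narrowest type first.
    static String guessType(String value) {
        try {
            Long.parseLong(value);
            return "long";
        } catch (NumberFormatException e) {
            // not an integer, try a floating-point type next
        }
        try {
            Double.parseDouble(value);
            return "double";
        } catch (NumberFormatException e) {
            return "string";
        }
    }

    // The first value seen for a field fixes its type; every later value
    // must parse as that type or the add fails.
    void index(String field, String value) {
        String type = schema.computeIfAbsent(field, f -> guessType(value));
        if (type.equals("long")) {
            Long.parseLong(value);      // throws NumberFormatException for "1.0"
        } else if (type.equals("double")) {
            Double.parseDouble(value);
        }
        // "string" accepts anything
    }

    String typeOf(String field) {
        return schema.get(field);
    }

    public static void main(String[] args) {
        FieldGuessDemo demo = new FieldGuessDemo();
        demo.index("price", "1");       // field is now guessed as long
        try {
            demo.index("price", "1.0"); // violates the earlier guess
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With an explicit schema you would declare the field as a floating-point
type up front, so the order in which documents arrive no longer matters.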

I totally agree that 46 million docs later you'd expect all of this
kind of thing to have flushed out, but the "possible analysis error"
seems to be pointing in that direction. If this is, indeed, the
problem, you'll see better evidence on the Solr instance that's
actually having the problem. Unfortunately you'll just have to look
at one Solr log from each shard to see whether this is an issue.
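
If it helps with going through the logs, below is a small Java sketch
that tallies the two error messages quoted in this thread in one log
file. It is only a sketch under stated assumptions: the class name
SolrLogTriage is made up, the path to the log is passed as the first
argument, and each message is assumed to sit on a single line in the
real log (the line wrapping in this email is from the mail client).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class SolrLogTriage {
    enum Cause { HDFS_REPLICATION, ANALYSIS, OTHER }

    // Classify one log line by the message fragments seen in this thread.
    static Cause classify(String line) {
        if (line.contains("could only be replicated to 0 nodes")) {
            return Cause.HDFS_REPLICATION;
        }
        if (line.contains("possible analysis error")) {
            return Cause.ANALYSIS;
        }
        return Cause.OTHER;
    }

    // Count how many lines fall into each category.
    static Map<Cause, Integer> tally(List<String> lines) {
        Map<Cause, Integer> counts = new EnumMap<>(Cause.class);
        for (String line : lines) {
            counts.merge(classify(line), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to one Solr log file (an assumption of this sketch)
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(tally(lines));
    }
}
```

A log with many HDFS_REPLICATION hits and no ANALYSIS hits would point
at HDFS rather than at the schema, and the other way round.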

Best,
Erick

On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger
<[hidden email]> wrote:

> So far we've indexed about 46 million documents, but over the weekend, these
> errors started coming up.  I would expect that if there was a basic issue,
> it would have started right away?  We ran a test cluster with just a few
> shards/replicas prior and didn't see any issues using the same indexing
> code, but we're running a lot more indexers simultaneously with the larger
> cluster; perhaps we're just overloading HDFS?  The same nodes that run Solr
> also run HDFS datanodes, but they are pretty beefy machines; we're not
> swapping.
>
> As Shawn pointed out, I will be checking the HDFS version (we're using
> Cloudera CDH 5.10.2), and the HDFS logs.
>
> -Joe
>
>
>
> On 7/17/2017 10:16 AM, Susheel Kumar wrote:
>>
>> There is also an analysis error.  I would suggest testing the indexer on
>> a single-shard setup first, then with a replica (1 shard and 1 replica),
>> and then with 2 shards and 2 replicas.  This would confirm whether there
>> is a basic issue with the indexing / cluster setup.
>>
>> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
>> [hidden email]> wrote:
>>
>>> Some more info:
>>>
>>> When I stop all the indexers, in about 5-10 minutes the cluster goes all
>>> green.  When I start just one indexer, several nodes immediately go down
>>> with the 'Error logging add' message.
>>>
>>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
>>> indexing.  Is this correct for SolrCloud?
>>>
>>> Thank you!
>>>
>>> -Joe
>>>
>>>
>>>
>>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>>>
>>>> We've been indexing data on a 45 node cluster with 100 shards and 3
>>>> replicas, but our indexing processes have been stopping due to errors.
>>>> On
>>>> the server side the error is "Error logging add". Stack trace:
>>>>
>>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
>>>> s:shard58
>>>> r:core_node290 x:UNCLASS_shard58_replica1]
>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0
>>>> 171
>>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
>>>> s:shard13
>>>> r:core_node81 x:UNCLASS_shard13_replica1]
>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0
>>>> 274
>>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
>>>> s:shard43
>>>> r:core_node108 x:UNCLASS_shard43_replica1]
>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>> s:shard43
>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>>>> org.apache.solr.common.SolrException: Error logging add
>>>>          at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>>> java:418)
>>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>          at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>>> nLoader.java:98)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>> odec.java:306)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>> c.java:251)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>> odec.java:271)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>> c.java:251)
>>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>>> dec.java:173)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>>> s(JavabinLoader.java:108)
>>>>          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>>> der.java:55)
>>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>>> questHandler.java:97)
>>>>          at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>>> uestHandlerBase.java:173)
>>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>          at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>>> java:723)
>>>>          at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>>> 529)
>>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>> atchFilter.java:361)
>>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>> atchFilter.java:305)
>>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>>> r(ServletHandler.java:1691)
>>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>>> dler.java:582)
>>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>> Handler.java:143)
>>>>          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>>> ndler.java:548)
>>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>> SessionHandler.java:226)
>>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>> ContextHandler.java:1180)
>>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>>> ler.java:512)
>>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>> SessionHandler.java:185)
>>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>> ContextHandler.java:1112)
>>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>> Handler.java:141)
>>>>          at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>>> ndle(ContextHandlerCollection.java:213)
>>>>          at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>>> HandlerCollection.java:119)
>>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>> erWrapper.java:134)
>>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>>> iteHandler.java:335)
>>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>> erWrapper.java:134)
>>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>> java:320)
>>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>>> ction.java:251)
>>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>> succeeded(AbstractConnection.java:273)
>>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>> java:95)
>>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>>> elEndPoint.java:93)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .run(ExecuteProduceConsume.java:136)
>>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>>> ThreadPool.java:671)
>>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>>> hreadPool.java:589)
>>>>          at java.lang.Thread.run(Thread.java:748)
>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>> There
>>>> are 40 datanode(s) running and no node(s) are excluded in this
>>>> operation.
>>>>          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>>> ionalBlock(FSNamesystem.java:3351)
>>>>          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>>> Block(NameNodeRpcServer.java:683)
>>>>          at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>> tProtocol.java:214)
>>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>> TranslatorPB.java:495)
>>>>          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>> enodeProtocolProtos.java)
>>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>>> upInformation.java:1920)
>>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>>
>>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>> ProtobufRpcEngine.java:229)
>>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>> thodAccessorImpl.java:43)
>>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>>> od(RetryInvocationHandler.java:191)
>>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>>> ryInvocationHandler.java:102)
>>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>> DFSOutputStream.java:449)
>>>>
>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>> s:shard43
>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>>>> org.apache.solr.common.SolrException: Error logging add
>>>>          at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>>> java:418)
>>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>          at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>          at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>>> nLoader.java:98)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>> odec.java:306)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>> c.java:251)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>> odec.java:271)
>>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>> c.java:251)
>>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>>> dec.java:173)
>>>>          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>>> s(JavabinLoader.java:108)
>>>>          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>>> der.java:55)
>>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>>> questHandler.java:97)
>>>>          at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>          at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>>> uestHandlerBase.java:173)
>>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>          at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>>> java:723)
>>>>          at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>>> 529)
>>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>> atchFilter.java:361)
>>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>> atchFilter.java:305)
>>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>>> r(ServletHandler.java:1691)
>>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>>> dler.java:582)
>>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>> Handler.java:143)
>>>>          at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>>> ndler.java:548)
>>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>> SessionHandler.java:226)
>>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>> ContextHandler.java:1180)
>>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>>> ler.java:512)
>>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>> SessionHandler.java:185)
>>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>> ContextHandler.java:1112)
>>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>> Handler.java:141)
>>>>          at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>>> ndle(ContextHandlerCollection.java:213)
>>>>          at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>>> HandlerCollection.java:119)
>>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>> erWrapper.java:134)
>>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>>> iteHandler.java:335)
>>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>> erWrapper.java:134)
>>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>> java:320)
>>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>>> ction.java:251)
>>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>> succeeded(AbstractConnection.java:273)
>>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>> java:95)
>>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>>> elEndPoint.java:93)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>          at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>> .run(ExecuteProduceConsume.java:136)
>>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>>> ThreadPool.java:671)
>>>>          at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>>> hreadPool.java:589)
>>>>          at java.lang.Thread.run(Thread.java:748)
>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>> There
>>>> are 40 datanode(s) running and no node(s) are excluded in this
>>>> operation.
>>>>          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>>> ionalBlock(FSNamesystem.java:3351)
>>>>          at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>>> Block(NameNodeRpcServer.java:683)
>>>>          at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>> tProtocol.java:214)
>>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>> TranslatorPB.java:495)
>>>>          at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>> enodeProtocolProtos.java)
>>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>>          at java.security.AccessController.doPrivileged(Native Method)
>>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>>> upInformation.java:1920)
>>>>          at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>>
>>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>> ProtobufRpcEngine.java:229)
>>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>          at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>> thodAccessorImpl.java:43)
>>>>          at java.lang.reflect.Method.invoke(Method.java:498)
>>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>>> od(RetryInvocationHandler.java:191)
>>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>>> ryInvocationHandler.java:102)
>>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>> DFSOutputStream.java:449)
>>>>
>>>> 2017-07-17 12:29:24.187 INFO
>>>> (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>>>> state:SyncConnected type:NodeDataChanged
>>>> path:/collections/UNCLASS/state.json]
>>>> for collection [UNCLASS] has occurred - updating... (live nodes size:
>>>> [45])
>>>>
>>>> On the client side, the error looks like:
>>>> 2017-07-16 19:03:16,118 WARN
>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>> Indexing error:
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>> error. for collection: UNCLASS
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>> error.
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:819)
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1263)
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:1134)
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:1073)
>>>>          at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:160)
>>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
>>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
>>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85)
>>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(IndexDocument.java:959)
>>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(IndexDocument.java:236)
>>>>          at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(SolrIndexerProcessor.java:63)
>>>>          at com.ngc.intelenterprise.intelentutil.utils.Processor.run(Processor.java:140)
>>>>          at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.process(IntelEntQueueProc.java:208)
>>>>          at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
>>>>          at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
>>>>          at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:460)
>>>>          at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>>>>          at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
>>>>          at org.apache.camel.component.seda.SedaConsumer.sendToConsumers(SedaConsumer.java:298)
>>>>          at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsumer.java:207)
>>>>          at org.apache.camel.component.seda.SedaConsumer.run(SedaConsumer.java:154)
>>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>          at java.lang.Thread.run(Thread.java:748)
>>>> Caused by:
>>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>>>> Exception writing document id COLLECT10086453202 to the index; possible
>>>> analysis error.
>>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:610)
>>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
>>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
>>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
>>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
>>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:796)
>>>>          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>          at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
>>>>          ... 3 more
>>>> 2017-07-16 19:03:16,134 ERROR
>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>> Error indexing:
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>> error. for collection: UNCLASS.
>>>> 2017-07-16 19:03:16,135 ERROR
>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>> Exception during indexing:
>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>> error.
>>>>
>>>> I can fire them back up, but they only run for a short time before
>>>> getting more indexing errors.  Several of the nodes show as down in the
>>>> cloud view.  Any help would be appreciated!  Thank you!
>>>>
>>>>
>>>> -Joe
>>>>
>>>>
>>
>>
>

Re: Solr 6.6.0 - Indexing errors

Susheel Kumar-3
Also, the document ID is mentioned above where it failed with the analysis
error.  You can look at how those documents differ, as Erick suggested.
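
To collect the failing ids from the indexer logs, something along these
lines might help. It is only a sketch: the class name FailedDocId is made
up for illustration, and the regex assumes the message wording shown in
the client-side log in this thread ("Exception writing document id ... to
the index"), which may differ in other Solr versions.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SolrJ.
class FailedDocId {
    // Assumes the message wording seen in the client log in this thread.
    private static final Pattern DOC_ID =
            Pattern.compile("Exception writing document id (\\S+) to the index");

    /** Returns the document id named in the error message, if there is one. */
    static Optional<String> extract(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher m = DOC_ID.matcher(message);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String msg = "Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3: "
                + "Exception writing document id COLLECT10086453202 to the index; "
                + "possible analysis error.";
        // Prints the id found in the message, or a placeholder if none matched.
        System.out.println(extract(msg).orElse("(no id found)"));
    }
}
```

In the indexer's catch block you could pass e.getMessage() to extract()
and write the id to a separate file, then compare those documents against
ones that indexed cleanly.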

On Mon, Jul 17, 2017 at 11:53 AM, Erick Erickson <[hidden email]>
wrote:

> Joe:
>
> I agree that 46 million docs later you'd expect things to have settled
> out. However, I do note that you have
> "add-unknown-fields-to-the-schema" in your error stack, which means
> you're using "field guessing", sometimes called data_driven. I would
> recommend you do _not_ use this in production: while it does the best
> job it can, it has to make assumptions about what the data looks like
> based on the first document it sees, and later documents may violate
> those assumptions. "Possible analysis error" is one of the messages
> you get when this occurs.
>
> The simple example: if the first value data_driven sees is "1", it'll
> guess integer. If a doc with "1.0" comes along later, it'll generate a
> parse error.
>
> I totally agree that 46 million docs later you'd expect all of this
> kind of thing to have flushed out, but the "possible analysis error"
> seems to be pointing in that direction. If this is, indeed, the problem,
> you'll see better evidence on the Solr instance that's actually having
> the problem. Unfortunately, you'll just have to look at one Solr log
> from each shard to see whether this is an issue.
>
> Best,
> Erick
>
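
A toy illustration of the "1" then "1.0" case Erick describes, for anyone
who wants to see the failure mode in isolation. This is not Solr's code:
FieldGuessDemo, guessType and the type names "int"/"double"/"string" are
all invented for the sketch, and real field guessing maps values to Solr
field types through the update chain. The only point is that the type is
fixed by the first value seen and a later value may not parse as that type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for schema guessing; not Solr's implementation.
class FieldGuessDemo {
    // field name -> type locked in by the first value seen for that field
    private final Map<String, String> schema = new HashMap<>();

    /** Guesses a type from a single value, trying the narrowest type first. */
    static String guessType(String value) {
        try {
            Integer.parseInt(value);
            return "int";
        } catch (NumberFormatException ignored) {
            // not an integer, fall through
        }
        try {
            Double.parseDouble(value);
            return "double";
        } catch (NumberFormatException ignored) {
            // not a number at all
        }
        return "string";
    }

    /** Returns true if the value parses as the type already fixed for the field. */
    boolean add(String field, String value) {
        String type = schema.computeIfAbsent(field, f -> guessType(value));
        try {
            if (type.equals("int")) {
                Integer.parseInt(value);
            } else if (type.equals("double")) {
                Double.parseDouble(value);
            }
            return true;
        } catch (NumberFormatException e) {
            return false; // the analogue of a parse error at index time
        }
    }

    public static void main(String[] args) {
        FieldGuessDemo demo = new FieldGuessDemo();
        System.out.println(demo.add("count", "1"));   // first value fixes the type
        System.out.println(demo.add("count", "1.0")); // later value no longer fits
    }
}
```

With an explicitly defined schema the field type is chosen up front, so
the outcome does not depend on which document happens to arrive first.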
> On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger
> <[hidden email]> wrote:
> > So far we've indexed about 46 million documents, but over the weekend,
> > these errors started coming up.  I would expect that if there was a
> > basic issue, it would have started right away?  We ran a test cluster
> > with just a few shards/replicas prior and didn't see any issues using
> > the same indexing code, but we're running a lot more indexers
> > simultaneously with the larger cluster; perhaps we're just overloading
> > HDFS?  The same nodes that run Solr also run HDFS datanodes, but they
> > are pretty beefy machines; we're not swapping.
> >
> > As Shawn pointed out, I will be checking the HDFS version (we're using
> > Cloudera CDH 5.10.2), and the HDFS logs.
> >
> > -Joe
> >
> >
> >
> > On 7/17/2017 10:16 AM, Susheel Kumar wrote:
> >>
> >> There is also an analysis error.  I would suggest testing the indexer
> >> on a single-shard setup first, then with a replica (1 shard and 1
> >> replica), and then with 2 shards and 2 replicas.  This would confirm
> >> whether there is a basic issue with the indexing / cluster setup.
> >>
> >> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
> >> [hidden email]> wrote:
> >>
> >>> Some more info:
> >>>
> >>> When I stop all the indexers, in about 5-10 minutes the cluster goes
> >>> all green.  When I start just one indexer, several nodes immediately
> >>> go down with the 'Error logging add' message.
> >>>
> >>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
> >>> indexing.  Is this correct for SolrCloud?
> >>>
> >>> Thank you!
> >>>
> >>> -Joe
> >>>
> >>>
> >>>
> >>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
> >>>
> >>>> We've been indexing data on a 45 node cluster with 100 shards and 3
> >>>> replicas, but our indexing processes have been stopping due to errors.
> >>>> On
> >>>> the server side the error is "Error logging add". Stack trace:
> >>>>
> >>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
> >>>> s:shard58
> >>>> r:core_node290 x:UNCLASS_shard58_replica1]
> >>>> o.a.s.u.p.LogUpdateProcessorFactory
> >>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
> >>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
> >>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
> >>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
> >>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
> >>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
> >>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
> >>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0
> >>>> 171
> >>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
> >>>> s:shard13
> >>>> r:core_node81 x:UNCLASS_shard13_replica1]
> >>>> o.a.s.u.p.LogUpdateProcessorFactory
> >>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
> >>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
> >>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
> >>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
> >>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
> >>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
> >>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
> >>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
> >>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
> >>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0
> >>>> 274
> >>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
> >>>> s:shard43
> >>>> r:core_node108 x:UNCLASS_shard43_replica1]
> >>>> o.a.s.u.p.LogUpdateProcessorFactory
> >>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
> >>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
> >>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
> >>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
> >>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
> >>>> s:shard43
> >>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
> >>>> org.apache.solr.common.SolrException: Error logging add
> >>>>          at org.apache.solr.update.TransactionLog.write(
> TransactionLog.
> >>>> java:418)
> >>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
> >>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
> >>>>          at org.apache.solr.update.processor.
> DistributedUpdateProcessor.
> >>>> versionAdd(DistributedUpdateProcessor.java:1113)
> >>>>          at org.apache.solr.update.processor.
> DistributedUpdateProcessor.
> >>>> processAdd(DistributedUpdateProcessor.java:748)
> >>>>          at org.apache.solr.update.processor.
> LogUpdateProcessorFactory$L
> >>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(
> Javabi
> >>>> nLoader.java:98)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(
> JavaBinC
> >>>> odec.java:306)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(
> JavaBinCode
> >>>> c.java:251)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(
> JavaBinC
> >>>> odec.java:271)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(
> JavaBinCode
> >>>> c.java:251)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(
> JavaBinCo
> >>>> dec.java:173)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader.
> parseAndLoadDoc
> >>>> s(JavabinLoader.java:108)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader.load(
> JavabinLoa
> >>>> der.java:55)
> >>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(
> UpdateRe
> >>>> questHandler.java:97)
> >>>>          at org.apache.solr.handler.ContentStreamHandlerBase.
> handleReque
> >>>> stBody(ContentStreamHandlerBase.java:68)
> >>>>          at org.apache.solr.handler.RequestHandlerBase.
> handleRequest(Req
> >>>> uestHandlerBase.java:173)
> >>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> >>>>          at org.apache.solr.servlet.HttpSolrCall.execute(
> HttpSolrCall.
> >>>> java:723)
> >>>>          at org.apache.solr.servlet.HttpSolrCall.call(
> HttpSolrCall.java:
> >>>> 529)
> >>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDisp
> >>>> atchFilter.java:361)
> >>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDisp
> >>>> atchFilter.java:305)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> doFilte
> >>>> r(ServletHandler.java:1691)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHan
> >>>> dler.java:582)
> >>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> Scoped
> >>>> Handler.java:143)
> >>>>          at org.eclipse.jetty.security.SecurityHandler.handle(
> SecurityHa
> >>>> ndler.java:548)
> >>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
> >>>> SessionHandler.java:226)
> >>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
> >>>> ContextHandler.java:1180)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(
> ServletHand
> >>>> ler.java:512)
> >>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
> >>>> SessionHandler.java:185)
> >>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
> >>>> ContextHandler.java:1112)
> >>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> Scoped
> >>>> Handler.java:141)
> >>>>          at org.eclipse.jetty.server.handler.
> ContextHandlerCollection.ha
> >>>> ndle(ContextHandlerCollection.java:213)
> >>>>          at org.eclipse.jetty.server.handler.HandlerCollection.
> handle(
> >>>> HandlerCollection.java:119)
> >>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> Handl
> >>>> erWrapper.java:134)
> >>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
> Rewr
> >>>> iteHandler.java:335)
> >>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> Handl
> >>>> erWrapper.java:134)
> >>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
> >>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
> >>>> java:320)
> >>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(
> HttpConne
> >>>> ction.java:251)
> >>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
> >>>> succeeded(AbstractConnection.java:273)
> >>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
> >>>> java:95)
> >>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
> SelectChann
> >>>> elEndPoint.java:93)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .produceConsume(ExecuteProduceConsume.java:148)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .run(ExecuteProduceConsume.java:136)
> >>>>          at org.eclipse.jetty.util.thread.
> QueuedThreadPool.runJob(Queued
> >>>> ThreadPool.java:671)
> >>>>          at org.eclipse.jetty.util.thread.
> QueuedThreadPool$2.run(QueuedT
> >>>> hreadPool.java:589)
> >>>>          at java.lang.Thread.run(Thread.java:748)
> >>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.
> IOException):
> >>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.
> 0000000000000006211
> >>>> could only be replicated to 0 nodes instead of minReplication (=1).
> >>>> There
> >>>> are 40 datanode(s) running and no node(s) are excluded in this
> >>>> operation.
> >>>>          at org.apache.hadoop.hdfs.server.
> blockmanagement.BlockManager.c
> >>>> hooseTarget4NewBlock(BlockManager.java:1622)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.FSNamesystem.getAddit
> >>>> ionalBlock(FSNamesystem.java:3351)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.NameNodeRpcServer.add
> >>>> Block(NameNodeRpcServer.java:683)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.AuthorizationProvider
> >>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
> >>>> tProtocol.java:214)
> >>>>          at org.apache.hadoop.hdfs.protocolPB.
> ClientNamenodeProtocolServ
> >>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
> >>>> TranslatorPB.java:495)
> >>>>          at org.apache.hadoop.hdfs.protocol.proto.
> ClientNamenodeProtocol
> >>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
> >>>> enodeProtocolProtos.java)
> >>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
> ProtoBufRpcIn
> >>>> voker.call(ProtobufRpcEngine.java:617)
> >>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> >>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
> 2216)
> >>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
> 2212)
> >>>>          at java.security.AccessController.doPrivileged(Native
> Method)
> >>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
> >>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGro
> >>>> upInformation.java:1920)
> >>>>          at org.apache.hadoop.ipc.Server$
> Handler.run(Server.java:2210)
> >>>>
> >>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
> >>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
> >>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
> >>>> ProtobufRpcEngine.java:229)
> >>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
> >>>>          at org.apache.hadoop.hdfs.protocolPB.
> ClientNamenodeProtocolTran
> >>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
> >>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown
> Source)
> >>>>          at sun.reflect.DelegatingMethodAccessorImpl.
> invoke(DelegatingMe
> >>>> thodAccessorImpl.java:43)
> >>>>          at java.lang.reflect.Method.invoke(Method.java:498)
> >>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.
> invokeMeth
> >>>> od(RetryInvocationHandler.java:191)
> >>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(
> Ret
> >>>> ryInvocationHandler.java:102)
> >>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
> locateFo
> >>>> llowingBlock(DFSOutputStream.java:1459)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
> nextBloc
> >>>> kOutputStream(DFSOutputStream.java:1255)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
> >>>> DFSOutputStream.java:449)
> >>>>
> >>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
> >>>> s:shard43
> >>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
> >>>> org.apache.solr.common.SolrException: Error logging add
> >>>>          at org.apache.solr.update.TransactionLog.write(
> TransactionLog.
> >>>> java:418)
> >>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
> >>>>          at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
> >>>>          at org.apache.solr.update.processor.
> DistributedUpdateProcessor.
> >>>> versionAdd(DistributedUpdateProcessor.java:1113)
> >>>>          at org.apache.solr.update.processor.
> DistributedUpdateProcessor.
> >>>> processAdd(DistributedUpdateProcessor.java:748)
> >>>>          at org.apache.solr.update.processor.
> LogUpdateProcessorFactory$L
> >>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader$1.update(
> Javabi
> >>>> nLoader.java:98)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(
> JavaBinC
> >>>> odec.java:306)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(
> JavaBinCode
> >>>> c.java:251)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readObject(
> JavaBinC
> >>>> odec.java:271)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.readVal(
> JavaBinCode
> >>>> c.java:251)
> >>>>          at org.apache.solr.common.util.JavaBinCodec.unmarshal(
> JavaBinCo
> >>>> dec.java:173)
> >>>>          at org.apache.solr.client.solrj.request.
> JavaBinUpdateRequestCod
> >>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader.
> parseAndLoadDoc
> >>>> s(JavabinLoader.java:108)
> >>>>          at org.apache.solr.handler.loader.JavabinLoader.load(
> JavabinLoa
> >>>> der.java:55)
> >>>>          at org.apache.solr.handler.UpdateRequestHandler$1.load(
> UpdateRe
> >>>> questHandler.java:97)
> >>>>          at org.apache.solr.handler.ContentStreamHandlerBase.
> handleReque
> >>>> stBody(ContentStreamHandlerBase.java:68)
> >>>>          at org.apache.solr.handler.RequestHandlerBase.
> handleRequest(Req
> >>>> uestHandlerBase.java:173)
> >>>>          at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
> >>>>          at org.apache.solr.servlet.HttpSolrCall.execute(
> HttpSolrCall.
> >>>> java:723)
> >>>>          at org.apache.solr.servlet.HttpSolrCall.call(
> HttpSolrCall.java:
> >>>> 529)
> >>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDisp
> >>>> atchFilter.java:361)
> >>>>          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDisp
> >>>> atchFilter.java:305)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> doFilte
> >>>> r(ServletHandler.java:1691)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHan
> >>>> dler.java:582)
> >>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> Scoped
> >>>> Handler.java:143)
> >>>>          at org.eclipse.jetty.security.SecurityHandler.handle(
> SecurityHa
> >>>> ndler.java:548)
> >>>>          at org.eclipse.jetty.server.session.SessionHandler.doHandle(
> >>>> SessionHandler.java:226)
> >>>>          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
> >>>> ContextHandler.java:1180)
> >>>>          at org.eclipse.jetty.servlet.ServletHandler.doScope(
> ServletHand
> >>>> ler.java:512)
> >>>>          at org.eclipse.jetty.server.session.SessionHandler.doScope(
> >>>> SessionHandler.java:185)
> >>>>          at org.eclipse.jetty.server.handler.ContextHandler.doScope(
> >>>> ContextHandler.java:1112)
> >>>>          at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> Scoped
> >>>> Handler.java:141)
> >>>>          at org.eclipse.jetty.server.handler.
> ContextHandlerCollection.ha
> >>>> ndle(ContextHandlerCollection.java:213)
> >>>>          at org.eclipse.jetty.server.handler.HandlerCollection.
> handle(
> >>>> HandlerCollection.java:119)
> >>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> Handl
> >>>> erWrapper.java:134)
> >>>>          at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
> Rewr
> >>>> iteHandler.java:335)
> >>>>          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> Handl
> >>>> erWrapper.java:134)
> >>>>          at org.eclipse.jetty.server.Server.handle(Server.java:534)
> >>>>          at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
> >>>> java:320)
> >>>>          at org.eclipse.jetty.server.HttpConnection.onFillable(
> HttpConne
> >>>> ction.java:251)
> >>>>          at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
> >>>> succeeded(AbstractConnection.java:273)
> >>>>          at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
> >>>> java:95)
> >>>>          at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
> SelectChann
> >>>> elEndPoint.java:93)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .produceConsume(ExecuteProduceConsume.java:148)
> >>>>          at org.eclipse.jetty.util.thread.
> strategy.ExecuteProduceConsume
> >>>> .run(ExecuteProduceConsume.java:136)
> >>>>          at org.eclipse.jetty.util.thread.
> QueuedThreadPool.runJob(Queued
> >>>> ThreadPool.java:671)
> >>>>          at org.eclipse.jetty.util.thread.
> QueuedThreadPool$2.run(QueuedT
> >>>> hreadPool.java:589)
> >>>>          at java.lang.Thread.run(Thread.java:748)
> >>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.
> IOException):
> >>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.
> 0000000000000006211
> >>>> could only be replicated to 0 nodes instead of minReplication (=1).
> >>>> There
> >>>> are 40 datanode(s) running and no node(s) are excluded in this
> >>>> operation.
> >>>>          at org.apache.hadoop.hdfs.server.
> blockmanagement.BlockManager.c
> >>>> hooseTarget4NewBlock(BlockManager.java:1622)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.FSNamesystem.getAddit
> >>>> ionalBlock(FSNamesystem.java:3351)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.NameNodeRpcServer.add
> >>>> Block(NameNodeRpcServer.java:683)
> >>>>          at org.apache.hadoop.hdfs.server.
> namenode.AuthorizationProvider
> >>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
> >>>> tProtocol.java:214)
> >>>>          at org.apache.hadoop.hdfs.protocolPB.
> ClientNamenodeProtocolServ
> >>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
> >>>> TranslatorPB.java:495)
> >>>>          at org.apache.hadoop.hdfs.protocol.proto.
> ClientNamenodeProtocol
> >>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
> >>>> enodeProtocolProtos.java)
> >>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
> ProtoBufRpcIn
> >>>> voker.call(ProtobufRpcEngine.java:617)
> >>>>          at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> >>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
> 2216)
> >>>>          at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:
> 2212)
> >>>>          at java.security.AccessController.doPrivileged(Native
> Method)
> >>>>          at javax.security.auth.Subject.doAs(Subject.java:422)
> >>>>          at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGro
> >>>> upInformation.java:1920)
> >>>>          at org.apache.hadoop.ipc.Server$
> Handler.run(Server.java:2210)
> >>>>
> >>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1475)
> >>>>          at org.apache.hadoop.ipc.Client.call(Client.java:1412)
> >>>>          at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
> >>>> ProtobufRpcEngine.java:229)
> >>>>          at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
> >>>>          at org.apache.hadoop.hdfs.protocolPB.
> ClientNamenodeProtocolTran
> >>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
> >>>>          at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown
> Source)
> >>>>          at sun.reflect.DelegatingMethodAccessorImpl.
> invoke(DelegatingMe
> >>>> thodAccessorImpl.java:43)
> >>>>          at java.lang.reflect.Method.invoke(Method.java:498)
> >>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.
> invokeMeth
> >>>> od(RetryInvocationHandler.java:191)
> >>>>          at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(
> Ret
> >>>> ryInvocationHandler.java:102)
> >>>>          at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
> locateFo
> >>>> llowingBlock(DFSOutputStream.java:1459)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.
> nextBloc
> >>>> kOutputStream(DFSOutputStream.java:1255)
> >>>>          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
> >>>> DFSOutputStream.java:449)
> >>>>
> >>>> 2017-07-17 12:29:24.187 INFO
> >>>> (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
> >>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
> >>>> state:SyncConnected type:NodeDataChanged
> >>>> path:/collections/UNCLASS/state.json]
> >>>> for collection [UNCLASS] has occurred - updating... (live nodes size:
> >>>> [45])
> >>>>
> >>>> On the client side, the error looks like:
> >>>> 2017-07-16 19:03:16,118 WARN
> >>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
> >>>> Indexing error: org.apache.solr.client.solrj.i
> >>>> mpl.CloudSolrClient$RouteException: Error from server at
> >>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
> >>>> document id COLLECT10086453202 to the index; possible analysis error.
> >>>> for
> >>>> collection: UNCLASS
> >>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException:
> Error
> >>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
> Exception
> >>>> writing document id COLLECT10086453202 to the index; possible analysis
> >>>> error.
> >>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.
> directUpda
> >>>> te(CloudSolrClient.java:819)
> >>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.
> sendReques
> >>>> t(CloudSolrClient.java:1263)
> >>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.
> requestWit
> >>>> hRetryOnStaleState(CloudSolrClient.java:1134)
> >>>>          at org.apache.solr.client.solrj.
> impl.CloudSolrClient.request(Cl
> >>>> oudSolrClient.java:1073)
> >>>>          at org.apache.solr.client.solrj.SolrRequest.process(
> SolrRequest
> >>>> .java:160)
> >>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
> java:
> >>>> 106)
> >>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
> java:
> >>>> 71)
> >>>>          at org.apache.solr.client.solrj.SolrClient.add(SolrClient.
> java:
> >>>> 85)
> >>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.
> indexSolrDocs(
> >>>> IndexDocument.java:959)
> >>>>          at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(IndexDocument.java:236)
> >>>>          at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(SolrIndexerProcessor.java:63)
> >>>>          at com.ngc.intelenterprise.intelentutil.utils.Processor.run(Processor.java:140)
> >>>>          at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.process(IntelEntQueueProc.java:208)
> >>>>          at org.apache.camel.processor.DelegateSyncProcessor.process(DelegateSyncProcessor.java:63)
> >>>>          at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
> >>>>          at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:460)
> >>>>          at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
> >>>>          at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)
> >>>>          at org.apache.camel.component.seda.SedaConsumer.sendToConsumers(SedaConsumer.java:298)
> >>>>          at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsumer.java:207)
> >>>>          at org.apache.camel.component.seda.SedaConsumer.run(SedaConsumer.java:154)
> >>>>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>>>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>>>          at java.lang.Thread.run(Thread.java:748)
> >>>> Caused by:
> >>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> >>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
> >>>> Exception writing document id COLLECT10086453202 to the index; possible
> >>>> analysis error.
> >>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:610)
> >>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:279)
> >>>>          at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:268)
> >>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:447)
> >>>>          at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:388)
> >>>>          at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$directUpdate$0(CloudSolrClient.java:796)
> >>>>          at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> >>>>          at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
> >>>>          ... 3 more
> >>>> 2017-07-16 19:03:16,134 ERROR
> >>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
> >>>> Error indexing:
> >>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at
> >>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
> >>>> document id COLLECT10086453202 to the index; possible analysis error.
> >>>> for collection: UNCLASS.
> >>>> 2017-07-16 19:03:16,135 ERROR
> >>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
> >>>> Exception during indexing:
> >>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at
> >>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
> >>>> document id COLLECT10086453202 to the index; possible analysis error.
> >>>>
> >>>> I can fire them back up, but they only run for a short time before
> >>>> getting more indexing errors.  Several of the nodes show as down in the
> >>>> cloud view.  Any help would be appreciated!  Thank you!
> >>>>
> >>>>
> >>>> -Joe
> >>>>
> >>>>
> >>
> >
>

Re: Solr 6.6.0 - Indexing errors

Joe Obernberger
In reply to this post by Erick Erickson
Erick - thank you.  I meant to disable field guessing, as our indexer
does this internally.  Thanks for seeing that!  Yes, we've seen things
come in like IDs that are 12345 (int), but then the next ID is 12AF456 (string).
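
To make the failure mode concrete, here is a minimal sketch of why a type guessed from the first value can reject a later one. It is only an illustration: the class FieldGuessDemo and its methods guessType and accept are hypothetical names, and this is not how Solr's add-unknown-fields-to-the-schema chain is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldGuessDemo {
    // Hypothetical stand-in for a schemaless type guess; NOT Solr's real logic.
    public static String guessType(String value) {
        try {
            Long.parseLong(value);
            return "long";
        } catch (NumberFormatException e) {
            return "string";
        }
    }

    // The first value seen for a field fixes its type in the "schema".
    // Later values are accepted only if they still fit that type.
    public static boolean accept(Map<String, String> schema, String field, String value) {
        String type = schema.computeIfAbsent(field, f -> guessType(value));
        return type.equals("string") || guessType(value).equals("long");
    }

    public static void main(String[] args) {
        Map<String, String> schema = new HashMap<>();
        System.out.println(accept(schema, "id", "12345"));   // true: field is now "long"
        System.out.println(accept(schema, "id", "12AF456")); // false: not parseable as long
    }
}
```

Declaring such fields explicitly as strings in the schema avoids depending on whichever value happens to arrive first.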

There is also a version mismatch between our Cloudera CDH 5.10.2 Hadoop
version and the Hadoop version shipped with Solr 6.6.0; I'm correcting that.
Thanks again!

-Joe


On 7/17/2017 11:53 AM, Erick Erickson wrote:

> Joe:
>
> I agree that 46 million docs later you'd expect things to have settled
> out. However, I do note that you have
> "add-unknown-fields-to-the-schema" in your error stack which means
> you're using "field guessing", sometimes called data_driven. I would
> recommend you do _not_ use this for production as, while it does the
> best job it can it has to make assumptions about what the data looks
> like based on the first document it sees which may later be violated.
> Getting "possible analysis error" is one of the messages that happens
> when this occurs.
>
> The simple example is that if the first time data_driven sees "1"
> it'll guess integer. If sometime later there's a doc with "1.0" it'll
> generate a parse error.
>
> I totally agree that 46 million docs later you'd expect all of this
> kind of thing to have flushed out, but the "possible analysis error"
> seems to be pointing that direction. If this is, indeed, the problem
> you'll see better evidence on the Solr instance that's actually having
> the problem. Unfortunately you'll just have to look at one Solr log from
> each shard to see whether this is an issue.
>
> Best,
> Erick
>
> On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger
> <[hidden email]> wrote:
>> So far we've indexed about 46 million documents, but over the weekend, these
>> errors started coming up.  I would expect that if there was a basic issue,
>> it would have started right away?  We ran a test cluster with just a few
>> shards/replicas prior and didn't see any issues using the same indexing
>> code, but we're running a lot more indexers simultaneously with the larger
>> cluster; perhaps we're just overloading HDFS?  The same nodes that run Solr
>> also run HDFS datanodes, but they are pretty beefy machines; we're not
>> swapping.
>>
>> As Shawn pointed out, I will be checking the HDFS version (we're using
>> Cloudera CDH 5.10.2), and the HDFS logs.
>>
>> -Joe
>>
>>
>>
>> On 7/17/2017 10:16 AM, Susheel Kumar wrote:
>>> There is also an analysis error.  I would suggest testing the indexer on
>>> a single-shard setup first, then with a replica (1 shard and 1 replica),
>>> and then with 2 shards and 2 replicas.  This would confirm whether there is
>>> a basic issue with the indexing / cluster setup.
>>>
>>> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
>>> [hidden email]> wrote:
>>>
>>>> Some more info:
>>>>
>>>> When I stop all the indexers, in about 5-10 minutes the cluster goes all
>>>> green.  When I start just one indexer, several nodes immediately go down
>>>> with the 'Error logging add' message.
>>>>
>>>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
>>>> indexing.  Is this correct for SolrCloud?
>>>>
>>>> Thank you!
>>>>
>>>> -Joe
>>>>
>>>>
>>>>
>>>> On 7/17/2017 8:36 AM, Joe Obernberger wrote:
>>>>
>>>>> We've been indexing data on a 45 node cluster with 100 shards and 3
>>>>> replicas, but our indexing processes have been stopping due to errors.
>>>>> On
>>>>> the server side the error is "Error logging add". Stack trace:
>>>>>
>>>>> 2017-07-17 12:29:24.057 INFO  (qtp985934102-5161548) [c:UNCLASS
>>>>> s:shard58
>>>>> r:core_node290 x:UNCLASS_shard58_replica1]
>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>> [UNCLASS_shard58_replica1] webapp=/solr path=/update
>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>> UNCLASS_shard58_replica2/&wt=javabin&version=2}{add=[
>>>>> COLLECT20003218348784 (1573172872544780288), COLLECT20003218351447
>>>>> (1573172872620277760), COLLECT20003218353085 (1573172872625520640),
>>>>> COLLECT20003218357937 (1573172872627617792), COLLECT20003218361860
>>>>> (1573172872629714944), COLLECT20003218362535 (1573172872631812096)]} 0
>>>>> 171
>>>>> 2017-07-17 12:29:24.160 INFO  (qtp985934102-5160762) [c:UNCLASS
>>>>> s:shard13
>>>>> r:core_node81 x:UNCLASS_shard13_replica1]
>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>> [UNCLASS_shard13_replica1] webapp=/solr path=/update
>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>> UNCLASS_shard13_replica2/&wt=javabin&version=2}{add=[
>>>>> COLLECT20003218344436 (1573172872538488832), COLLECT20003218347497
>>>>> (1573172872620277760), COLLECT20003218351645 (1573172872625520640),
>>>>> COLLECT20003218356965 (1573172872629714944), COLLECT20003218357775
>>>>> (1573172872632860672), COLLECT20003218358017 (1573172872646492160),
>>>>> COLLECT20003218358152 (1573172872650686464), COLLECT20003218359395
>>>>> (1573172872651735040), COLLECT20003218362571 (1573172872652783616)]} 0
>>>>> 274
>>>>> 2017-07-17 12:29:24.163 INFO  (qtp985934102-5161057) [c:UNCLASS
>>>>> s:shard43
>>>>> r:core_node108 x:UNCLASS_shard43_replica1]
>>>>> o.a.s.u.p.LogUpdateProcessorFactory
>>>>> [UNCLASS_shard43_replica1] webapp=/solr path=/update
>>>>> params={update.distrib=FROMLEADER&update.chain=add-unknown-
>>>>> fields-to-the-schema&distrib.from=http://tarvos:9100/solr/
>>>>> UNCLASS_shard43_replica2/&wt=javabin&version=2}{} 0 0
>>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>>> s:shard43
>>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.h.RequestHandlerBase
>>>>> org.apache.solr.common.SolrException: Error logging add
>>>>>           at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>>>> java:418)
>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>>           at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>>           at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>>           at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>>>> nLoader.java:98)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>>> odec.java:306)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>>> c.java:251)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>>> odec.java:271)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>>> c.java:251)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>>>> dec.java:173)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>>>> s(JavabinLoader.java:108)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>>>> der.java:55)
>>>>>           at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>>>> questHandler.java:97)
>>>>>           at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>>           at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>>>> uestHandlerBase.java:173)
>>>>>           at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>>           at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>>>> java:723)
>>>>>           at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>>>> 529)
>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>>> atchFilter.java:361)
>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>>> atchFilter.java:305)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>>>> r(ServletHandler.java:1691)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>>>> dler.java:582)
>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>>> Handler.java:143)
>>>>>           at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>>>> ndler.java:548)
>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>>> SessionHandler.java:226)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>>> ContextHandler.java:1180)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>>>> ler.java:512)
>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>>> SessionHandler.java:185)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>>> ContextHandler.java:1112)
>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>>> Handler.java:141)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>>>> ndle(ContextHandlerCollection.java:213)
>>>>>           at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>>>> HandlerCollection.java:119)
>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>>> erWrapper.java:134)
>>>>>           at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>>>> iteHandler.java:335)
>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>>> erWrapper.java:134)
>>>>>           at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>>           at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>>> java:320)
>>>>>           at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>>>> ction.java:251)
>>>>>           at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>>> succeeded(AbstractConnection.java:273)
>>>>>           at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>>> java:95)
>>>>>           at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>>>> elEndPoint.java:93)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .run(ExecuteProduceConsume.java:136)
>>>>>           at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>>>> ThreadPool.java:671)
>>>>>           at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>>>> hreadPool.java:589)
>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>>> There
>>>>> are 40 datanode(s) running and no node(s) are excluded in this
>>>>> operation.
>>>>>           at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>>>> ionalBlock(FSNamesystem.java:3351)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>>>> Block(NameNodeRpcServer.java:683)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>>> tProtocol.java:214)
>>>>>           at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>>> TranslatorPB.java:495)
>>>>>           at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>>> enodeProtocolProtos.java)
>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>>>           at java.security.AccessController.doPrivileged(Native Method)
>>>>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>           at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>>>> upInformation.java:1920)
>>>>>           at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>>>
>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>>> ProtobufRpcEngine.java:229)
>>>>>           at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>>           at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>>           at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>>>           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>>> thodAccessorImpl.java:43)
>>>>>           at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>>>> od(RetryInvocationHandler.java:191)
>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>>>> ryInvocationHandler.java:102)
>>>>>           at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>>> DFSOutputStream.java:449)
>>>>>
>>>>> 2017-07-17 12:29:24.164 ERROR (qtp985934102-5161057) [c:UNCLASS
>>>>> s:shard43
>>>>> r:core_node108 x:UNCLASS_shard43_replica1] o.a.s.s.HttpSolrCall null:
>>>>> org.apache.solr.common.SolrException: Error logging add
>>>>>           at org.apache.solr.update.TransactionLog.write(TransactionLog.
>>>>> java:418)
>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:532)
>>>>>           at org.apache.solr.update.UpdateLog.add(UpdateLog.java:516)
>>>>>           at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>>> versionAdd(DistributedUpdateProcessor.java:1113)
>>>>>           at org.apache.solr.update.processor.DistributedUpdateProcessor.
>>>>> processAdd(DistributedUpdateProcessor.java:748)
>>>>>           at org.apache.solr.update.processor.LogUpdateProcessorFactory$L
>>>>> ogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader$1.update(Javabi
>>>>> nLoader.java:98)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>>> odec.java:306)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>>> c.java:251)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinC
>>>>> odec.java:271)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCode
>>>>> c.java:251)
>>>>>           at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCo
>>>>> dec.java:173)
>>>>>           at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCod
>>>>> ec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDoc
>>>>> s(JavabinLoader.java:108)
>>>>>           at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoa
>>>>> der.java:55)
>>>>>           at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRe
>>>>> questHandler.java:97)
>>>>>           at org.apache.solr.handler.ContentStreamHandlerBase.handleReque
>>>>> stBody(ContentStreamHandlerBase.java:68)
>>>>>           at org.apache.solr.handler.RequestHandlerBase.handleRequest(Req
>>>>> uestHandlerBase.java:173)
>>>>>           at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>>>>>           at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.
>>>>> java:723)
>>>>>           at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:
>>>>> 529)
>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>>> atchFilter.java:361)
>>>>>           at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDisp
>>>>> atchFilter.java:305)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilte
>>>>> r(ServletHandler.java:1691)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHan
>>>>> dler.java:582)
>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>>> Handler.java:143)
>>>>>           at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHa
>>>>> ndler.java:548)
>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>>>>> SessionHandler.java:226)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>>>>> ContextHandler.java:1180)
>>>>>           at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHand
>>>>> ler.java:512)
>>>>>           at org.eclipse.jetty.server.session.SessionHandler.doScope(
>>>>> SessionHandler.java:185)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>>>>> ContextHandler.java:1112)
>>>>>           at org.eclipse.jetty.server.handler.ScopedHandler.handle(Scoped
>>>>> Handler.java:141)
>>>>>           at org.eclipse.jetty.server.handler.ContextHandlerCollection.ha
>>>>> ndle(ContextHandlerCollection.java:213)
>>>>>           at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>>>>> HandlerCollection.java:119)
>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>>> erWrapper.java:134)
>>>>>           at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(Rewr
>>>>> iteHandler.java:335)
>>>>>           at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Handl
>>>>> erWrapper.java:134)
>>>>>           at org.eclipse.jetty.server.Server.handle(Server.java:534)
>>>>>           at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.
>>>>> java:320)
>>>>>           at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConne
>>>>> ction.java:251)
>>>>>           at org.eclipse.jetty.io.AbstractConnection$ReadCallback.
>>>>> succeeded(AbstractConnection.java:273)
>>>>>           at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.
>>>>> java:95)
>>>>>           at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChann
>>>>> elEndPoint.java:93)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .executeProduceConsume(ExecuteProduceConsume.java:303)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .produceConsume(ExecuteProduceConsume.java:148)
>>>>>           at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume
>>>>> .run(ExecuteProduceConsume.java:136)
>>>>>           at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(Queued
>>>>> ThreadPool.java:671)
>>>>>           at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedT
>>>>> hreadPool.java:589)
>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException):
>>>>> File /solr6.6.0/UNCLASS/core_node108/data/tlog/tlog.0000000000000006211
>>>>> could only be replicated to 0 nodes instead of minReplication (=1).
>>>>> There
>>>>> are 40 datanode(s) running and no node(s) are excluded in this
>>>>> operation.
>>>>>           at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.c
>>>>> hooseTarget4NewBlock(BlockManager.java:1622)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAddit
>>>>> ionalBlock(FSNamesystem.java:3351)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.add
>>>>> Block(NameNodeRpcServer.java:683)
>>>>>           at org.apache.hadoop.hdfs.server.namenode.AuthorizationProvider
>>>>> ProxyClientProtocol.addBlock(AuthorizationProviderProxyClien
>>>>> tProtocol.java:214)
>>>>>           at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServ
>>>>> erSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSide
>>>>> TranslatorPB.java:495)
>>>>>           at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocol
>>>>> Protos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNam
>>>>> enodeProtocolProtos.java)
>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcIn
>>>>> voker.call(ProtobufRpcEngine.java:617)
>>>>>           at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
>>>>>           at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2212)
>>>>>           at java.security.AccessController.doPrivileged(Native Method)
>>>>>           at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>           at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>>>> upInformation.java:1920)
>>>>>           at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2210)
>>>>>
>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1475)
>>>>>           at org.apache.hadoop.ipc.Client.call(Client.java:1412)
>>>>>           at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(
>>>>> ProtobufRpcEngine.java:229)
>>>>>           at com.sun.proxy.$Proxy11.addBlock(Unknown Source)
>>>>>           at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTran
>>>>> slatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:418)
>>>>>           at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>>>>>           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>>>> thodAccessorImpl.java:43)
>>>>>           at java.lang.reflect.Method.invoke(Method.java:498)
>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMeth
>>>>> od(RetryInvocationHandler.java:191)
>>>>>           at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Ret
>>>>> ryInvocationHandler.java:102)
>>>>>           at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFo
>>>>> llowingBlock(DFSOutputStream.java:1459)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBloc
>>>>> kOutputStream(DFSOutputStream.java:1255)
>>>>>           at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(
>>>>> DFSOutputStream.java:449)
>>>>>
>>>>> 2017-07-17 12:29:24.187 INFO
>>>>> (zkCallback-5-thread-144-processing-n:juliet:9100_solr)
>>>>> [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent
>>>>> state:SyncConnected type:NodeDataChanged
>>>>> path:/collections/UNCLASS/state.json]
>>>>> for collection [UNCLASS] has occurred - updating... (live nodes size:
>>>>> [45])
>>>>>
>>>>> On the client side, the error looks like:
>>>>> 2017-07-16 19:03:16,118 WARN
>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>> Indexing error: org.apache.solr.client.solrj.i
>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>> for
>>>>> collection: UNCLASS
>>>>> org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error
>>>>> from server at http://leda:9100/solr/UNCLASS_shard44_replica3: Exception
>>>>> writing document id COLLECT10086453202 to the index; possible analysis
>>>>> error.
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpda
>>>>> te(CloudSolrClient.java:819)
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.sendReques
>>>>> t(CloudSolrClient.java:1263)
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWit
>>>>> hRetryOnStaleState(CloudSolrClient.java:1134)
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.request(Cl
>>>>> oudSolrClient.java:1073)
>>>>>           at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest
>>>>> .java:160)
>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>>> 106)
>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>>> 71)
>>>>>           at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:
>>>>> 85)
>>>>>           at com.ngc.bigdata.ie_solrindexer.IndexDocument.indexSolrDocs(
>>>>> IndexDocument.java:959)
>>>>>           at com.ngc.bigdata.ie_solrindexer.IndexDocument.index(
>>>>> IndexDocument.java:236)
>>>>>           at com.ngc.bigdata.ie_solrindexer.SolrIndexerProcessor.doWork(S
>>>>> olrIndexerProcessor.java:63)
>>>>>           at com.ngc.intelenterprise.intelentutil.utils.Processor.run(
>>>>> Processor.java:140)
>>>>>           at com.ngc.intelenterprise.intelentutil.jms.IntelEntQueueProc.
>>>>> process(IntelEntQueueProc.java:208)
>>>>>           at org.apache.camel.processor.DelegateSyncProcessor.process(Del
>>>>> egateSyncProcessor.java:63)
>>>>>           at org.apache.camel.management.InstrumentationProcessor.process
>>>>> (InstrumentationProcessor.java:77)
>>>>>           at org.apache.camel.processor.RedeliveryErrorHandler.process(Re
>>>>> deliveryErrorHandler.java:460)
>>>>>           at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>>>> melInternalProcessor.java:190)
>>>>>           at org.apache.camel.processor.CamelInternalProcessor.process(Ca
>>>>> melInternalProcessor.java:190)
>>>>>           at org.apache.camel.component.seda.SedaConsumer.sendToConsumers
>>>>> (SedaConsumer.java:298)
>>>>>           at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsu
>>>>> mer.java:207)
>>>>>           at org.apache.camel.component.seda.SedaConsumer.run(SedaConsume
>>>>> r.java:154)
>>>>>           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPool
>>>>> Executor.java:1142)
>>>>>           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoo
>>>>> lExecutor.java:617)
>>>>>           at java.lang.Thread.run(Thread.java:748)
>>>>> Caused by:
>>>>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
>>>>> Error from server at http://leda:9100/solr/UNCLASS_shard44_replica3:
>>>>> Exception writing document id COLLECT10086453202 to the index; possible
>>>>> analysis error.
>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMeth
>>>>> od(HttpSolrClient.java:610)
>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>>>> pSolrClient.java:279)
>>>>>           at org.apache.solr.client.solrj.impl.HttpSolrClient.request(Htt
>>>>> pSolrClient.java:268)
>>>>>           at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest
>>>>> (LBHttpSolrClient.java:447)
>>>>>           at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(L
>>>>> BHttpSolrClient.java:388)
>>>>>           at org.apache.solr.client.solrj.impl.CloudSolrClient.lambda$dir
>>>>> ectUpdate$0(CloudSolrClient.java:796)
>>>>>           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>           at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE
>>>>> xecutor.lambda$execute$0(ExecutorUtil.java:229)
>>>>>           ... 3 more
>>>>> 2017-07-16 19:03:16,134 ERROR
>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>> Error indexing: org.apache.solr.client.solrj.i
>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>> for
>>>>> collection: UNCLASS.
>>>>> 2017-07-16 19:03:16,135 ERROR
>>>>> [com.ngc.bigdata.ie_solrindexer.IndexDocument]
>>>>> Exception during indexing: org.apache.solr.client.solrj.i
>>>>> mpl.CloudSolrClient$RouteException: Error from server at
>>>>> http://leda:9100/solr/UNCLASS_shard44_replica3: Exception writing
>>>>> document id COLLECT10086453202 to the index; possible analysis error.
>>>>>
>>>>> I can fire them back up, but they only run for a short time before
>>>>> getting more indexing errors.  Several of the nodes show as down in the
>>>>> cloud view.  Any help would be appreciated!  Thank you!
>>>>>
>>>>>
>>>>> -Joe
>>>>>
>>>>>


Re: Solr 6.6.0 - Indexing errors

Joe Obernberger
In reply to this post by Susheel Kumar-3
We use puppet to deploy the solr instance to all the nodes.  I changed
what was deployed to use the CDH jars, but our puppet module deletes the
old directory and replaces it.  So, all the core configuration files
under server/solr/ were removed. Zookeeper still has the configuration,
but the nodes won't come up.

Is there a way around this?  Re-creating these files manually isn't
realistic; do I need to re-index?

-Joe


On 7/17/2017 12:07 PM, Susheel Kumar wrote:

> There is also a document id mentioned above where it failed with the
> analysis error.  You can look at how those documents differ, as Erick
> suggested.
>
> On Mon, Jul 17, 2017 at 11:53 AM, Erick Erickson <[hidden email]>
> wrote:
>
>> Joe:
>>
>> I agree that 46 million docs later you'd expect things to have settled
>> out. However, I do note that you have
>> "add-unknown-fields-to-the-schema" in your error stack, which means
>> you're using "field guessing", sometimes called data_driven. I would
>> recommend you do _not_ use this for production: while it does the
>> best job it can, it has to make assumptions about what the data looks
>> like based on the first document it sees, and those assumptions may
>> later be violated. "Possible analysis error" is one of the messages
>> you get when this occurs.
>>
>> The simple example is that if the first time data_driven sees "1"
>> it'll guess integer. If sometime later there's a doc with "1.0" it'll
>> generate a parse error.
>>
>> I totally agree that 46 million docs later you'd expect all of this
>> kind of thing to have flushed out, but the "possible analysis error"
>> seems to be pointing in that direction. If this is, indeed, the problem,
>> you'll see better evidence on the Solr instance that's actually having
>> the problem. Unfortunately, you'll just have to look at one Solr log from
>> each shard to see whether this is an issue.
>>
>> Best,
>> Erick
>>
>> On Mon, Jul 17, 2017 at 7:23 AM, Joe Obernberger
>> <[hidden email]> wrote:
>>> So far we've indexed about 46 million documents, but over the weekend,
>>> these errors started coming up.  I would expect that if there was a basic
>>> issue, it would have started right away?  We ran a test cluster with just
>>> a few shards/replicas prior and didn't see any issues using the same
>>> indexing code, but we're running a lot more indexers simultaneously with
>>> the larger cluster; perhaps we're just overloading HDFS?  The same nodes
>>> that run Solr also run HDFS datanodes, but they are pretty beefy machines;
>>> we're not swapping.
>>>
>>> As Shawn pointed out, I will be checking the HDFS version (we're using
>>> Cloudera CDH 5.10.2), and the HDFS logs.
>>>
>>> -Joe
>>>
>>>
>>>
>>> On 7/17/2017 10:16 AM, Susheel Kumar wrote:
>>>> There is some analysis error also.  I would suggest testing the indexer
>>>> on a one-shard setup first, then with a replica (1 shard and 1 replica),
>>>> and then with 2 shards and 2 replicas.  This would confirm whether there
>>>> is a basic issue with the indexing / cluster setup.
>>>>
>>>> On Mon, Jul 17, 2017 at 9:04 AM, Joe Obernberger <
>>>> [hidden email]> wrote:
>>>>
>>>>> Some more info:
>>>>>
>>>>> When I stop all the indexers, in about 5-10 minutes the cluster goes
>>>>> all green.  When I start just one indexer, several nodes immediately go
>>>>> down with the 'Error logging add' message.
>>>>>
>>>>> I'm using CloudSolrClient.add(List<SolrInputDocument>) to do the
>>>>> indexing.  Is this correct for SolrCloud?
>>>>>
>>>>> Thank you!
>>>>>
>>>>> -Joe


Re: Solr 6.6.0 - Indexing errors

Shawn Heisey-2
On 7/17/2017 11:39 AM, Joe Obernberger wrote:
> We use puppet to deploy the solr instance to all the nodes.  I changed
> what was deployed to use the CDH jars, but our puppet module deletes
> the old directory and replaces it.  So, all the core configuration
> files under server/solr/ were removed. Zookeeper still has the
> configuration, but the nodes won't come up.
>
> Is there a way around this?  Re-creating these files manually isn't
> realistic; do I need to re-index?

Put the solr home elsewhere so it's not under the program directory and
doesn't get deleted when you re-deploy Solr.  When starting Solr
manually with bin/solr, this is done with the -s option.

If you install Solr as a service, which works on operating systems with
a strong GNU presence (such as Linux), then the solr home will typically
not be in the program directory.  The configuration script (default
filename is /etc/default/solr.in.sh) should not get deleted if Solr is
reinstalled, but I have not confirmed that this is the case.  The
service installer script is included in the Solr download.

With SolrCloud, deleting all the core data like that will NOT be
automatically fixed by restarting Solr.  SolrCloud will have lost part
of its data.  If you have enough replicas left after a loss like that to
remain fully operational, then you'll need to use the DELETEREPLICA and
ADDREPLICA actions on the Collections API to rebuild the data on that
server from the leader of each shard.
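
As a rough illustration of the DELETEREPLICA / ADDREPLICA step above, here
is a minimal Java sketch that only builds the two Collections API request
URLs; it sends nothing.  The action names and the collection, shard,
replica and node parameters are the Collections API's own.  The class and
method names are made up for the example, and the host, shard and
core_node values are placeholders, not taken from a real cluster state.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds Collections API URLs, does not call Solr.
class ReplicaRebuildUrls {

    // Appends URL-encoded parameters to <baseUrl>/admin/collections.
    static String collectionsApiUrl(String baseUrl, String action,
                                    Map<String, String> params) {
        StringBuilder url = new StringBuilder(baseUrl)
                .append("/admin/collections?action=").append(action);
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                url.append('&').append(e.getKey()).append('=')
                   .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a Java runtime.
            throw new IllegalStateException(ex);
        }
        return url.toString();
    }

    // Removes the dead replica; "replica" is the coreNodeName.
    static String deleteReplicaUrl(String baseUrl, String collection,
                                   String shard, String coreNodeName) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("collection", collection);
        p.put("shard", shard);
        p.put("replica", coreNodeName);
        return collectionsApiUrl(baseUrl, "DELETEREPLICA", p);
    }

    // Creates a fresh replica on the given node; it then syncs from the leader.
    static String addReplicaUrl(String baseUrl, String collection,
                                String shard, String node) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("collection", collection);
        p.put("shard", shard);
        p.put("node", node);
        return collectionsApiUrl(baseUrl, "ADDREPLICA", p);
    }

    public static void main(String[] args) {
        String base = "http://leda:9100/solr";   // placeholder base URL
        System.out.println(deleteReplicaUrl(base, "UNCLASS", "shard44", "core_node12"));
        System.out.println(addReplicaUrl(base, "UNCLASS", "shard44", "leda:9100_solr"));
    }
}
```

The two URLs would be requested in that order for each shard that had a
replica on the wiped server, against any live node in the cluster.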

If the collection is incomplete after the solr home on a server gets
deleted, you'll probably need to completely delete the collection, then
recreate it, and reindex.  And you'll need to look into adding
servers/replicas so the loss of a single server cannot take you offline.

Thanks,
Shawn


Re: Solr 6.6.0 - Indexing errors

Joe Obernberger
Thank you Shawn.  We will be adjusting solr.solr.home to point
somewhere else so that our puppet module will work.  We actually didn't
lose any data since the indexes are in HDFS.  Our configuration for our
largest collection is 100 shards with 3 replicas each on top of HDFS
with 3x replication.  Perhaps overkill.  It's just the core properties
files that we lost.  I ended up writing a program that uses the
CloudSolrClient to get all the info from zookeeper and then rebuild the
core properties files.  Looks like it is working.  For example, for a
collection called COL1 with config called COL1:

         // Needs: java.io.File, java.io.IOException,
         //        org.apache.solr.common.cloud.Replica,
         //        org.apache.solr.common.cloud.Slice
         // mainServer is a connected CloudSolrClient.
         // setContents(File, String) is a small helper (not shown) that
         // writes a String to a File.
         for (Slice s : mainServer.getZkStateReader().getClusterState()
                 .getCollection("COL1").getActiveSlices()) {
             for (Replica r : s.getReplicas()) {
                 System.out.println("Name: " + r.getCoreName());
                 System.out.println("CoreNodeName: " + r.getName());
                 System.out.println("Node name: " + r.getNodeName());
                 System.out.println("Shard: " + s.getName());

                 // One directory per node, one sub-directory per core.
                 File coreDir = new File(r.getNodeName() + "/" + r.getCoreName());
                 coreDir.mkdirs();
                 File output = new File(coreDir, "core.properties");

                 // Rebuild core.properties from the cluster state.
                 StringBuilder buff = new StringBuilder();
                 buff.append("collection.configName=COL1\n");
                 buff.append("name=").append(r.getCoreName());
                 buff.append("\nshard=").append(s.getName());
                 buff.append("\ncollection=COL1");
                 buff.append("\ncoreNodeName=").append(r.getName());
                 try {
                     setContents(output, buff.toString());
                 } catch (IOException ex) {
                     System.out.println("Error writing: " + ex);
                 }
             }
         }


Then I copied the files to the 45 servers and restarted solr 6.6.0 on
each.  It came back up OK, and it has been indexing all night long.
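
For anyone who wants to try the file-writing half of this without a
cluster, below is a minimal stand-alone sketch using only the JDK.  It
writes the same five properties as the snippet above; the class and method
names are invented for the example, and the node, core and coreNodeName
values in main are placeholders rather than real cluster state.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-alone version of the core.properties writer.
class CorePropertiesSketch {

    // Renders the five properties, one per line.
    static String render(String configName, String collection, String shard,
                         String coreName, String coreNodeName) {
        return "collection.configName=" + configName + "\n"
             + "name=" + coreName + "\n"
             + "shard=" + shard + "\n"
             + "collection=" + collection + "\n"
             + "coreNodeName=" + coreNodeName + "\n";
    }

    // Writes <root>/<nodeDir>/<coreName>/core.properties and returns its path.
    static Path write(Path root, String nodeDir, String coreName, String contents) {
        try {
            Path coreDir = root.resolve(nodeDir).resolve(coreName);
            Files.createDirectories(coreDir);
            Path file = coreDir.resolve("core.properties");
            Files.write(file, contents.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // Placeholder values; in the real program they come from ZooKeeper.
        String props = render("COL1", "COL1", "shard1",
                              "COL1_shard1_replica1", "core_node1");
        Path root = Paths.get(System.getProperty("java.io.tmpdir"), "core-props-sketch");
        System.out.println(write(root, "node1_9100_solr", "COL1_shard1_replica1", props));
    }
}
```

The resulting directory tree (one directory per node, one per core) can
then be copied under the solr home of each server.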

-Joe

On 7/17/2017 3:15 PM, Erick Erickson wrote


On 7/18/2017 12:31 PM, Shawn Heisey wrote:

> On 7/17/2017 11:39 AM, Joe Obernberger wrote:
>> We use puppet to deploy the solr instance to all the nodes.  I
>> changed what was deployed to use the CDH jars, but our puppet module
>> deletes the old directory and replaces it.  So, all the core
>> configuration files under server/solr/ were removed. Zookeeper still
>> has the configuration, but the nodes won't come up.
>>
>> Is there a way around this?  Re-creating these files manually isn't
>> realistic; do I need to re-index?
>
> Put the solr home elsewhere so it's not under the program directory
> and doesn't get deleted when you re-deploy Solr.  When starting Solr
> manually with bin/solr, this is done with the -s option.
>
> If you install Solr as a service, which works on operating systems
> with a strong GNU presence (such as Linux), then the solr home will
> typically not be in the program directory.  The configuration script
> (default filename is /etc/default/solr.in.sh) should not get deleted
> if Solr is reinstalled, but I have not confirmed that this is the
> case.  The service installer script is included in the Solr download.
>
> With SolrCloud, deleting all the core data like that will NOT be
> automatically fixed by restarting Solr.  SolrCloud will have lost part
> of its data.  If you have enough replicas left after a loss like that
> to remain fully operational, then you'll need to use the DELETEREPLICA
> and ADDREPLICA actions on the Collections API to rebuild the data on
> that server from the leader of each shard.
>
> If the collection is incomplete after the solr home on a server gets
> deleted, you'll probably need to completely delete the collection,
> then recreate it, and reindex.  And you'll need to look into adding
> servers/replicas so the loss of a single server cannot take you offline.
>
> Thanks,
> Shawn
>
