SOLR Cloud: Few cores goes to recovery mode all of a sudden

Doss
Hi,

We are running a 3-node SolrCloud (7.0.1) setup with a single-node ZooKeeper
ensemble. Each system has 16 CPUs, 90 GB RAM (14 GB heap), and 130 cores
(3 NRT replicas), with index sizes ranging from 700 MB to 20 GB.

autoCommit - every 10 minutes
softCommit - every 30 seconds

We have recently been facing the following problems:

============================================================================================


1. A few indexes go into recovery mode without any prior errors or warnings

2019-09-03 08:42:41.650 ERROR (qtp959447386-38425) [c:viewsindex s:shard1 r:core_node3 x:viewsindex] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: SolrCloud join: memberdetails has a local replica (memberdetails) on 12.1.2.10:8983_solr, but it is down
        at org.apache.solr.search.join.ScoreJoinQParserPlugin.findLocalReplicaForFromIndex(ScoreJoinQParserPlugin.java:325)
        at org.apache.solr.search.join.ScoreJoinQParserPlugin.getCoreName(ScoreJoinQParserPlugin.java:285)
        at org.apache.solr.search.JoinQParserPlugin$1.parseJoin(JoinQParserPlugin.java:92)
        at org.apache.solr.search.JoinQParserPlugin$1.parse(JoinQParserPlugin.java:74)
        at org.apache.solr.search.QParser.getQuery(QParser.java:168)
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:207)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2474)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:720)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:526)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:378)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:322)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:534)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:748)

============================================================================================


2. A few indexes failed with "Error opening new searcher"

null:org.apache.solr.common.SolrException: Error opening new searcher
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2066)
        at org.apache.solr.core.SolrCore.getRealtimeSearcher(SolrCore.java:1915)
        at org.apache.solr.handler.component.RealTimeGetComponent.getInputDocument(RealTimeGetComponent.java:615)
        at org.apache.solr.handler.component.RealTimeGetComponent.getInputDocument(RealTimeGetComponent.java:584)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:1377)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1090)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:753)
        at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:188)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:144)
        at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:311)
        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:130)
        at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:276)
        at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256)
        at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:178)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:195)
        at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
        at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2474)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:720)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:526)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:378)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:322)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:534)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.lucene.store.AlreadyClosedException: Already closed
        at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:337)
        at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:349)
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1948)

============================================================================================


3. Unable to create new native thread

2019-09-03 08:42:36.460 WARN  (qtp1348949648-20980) [c:viewsindex s:shard1 r:core_node5 x:viewsindex] o.e.j.u.t.QueuedThreadPool
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:714)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.startThreads(QueuedThreadPool.java:475)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.access$200(QueuedThreadPool.java:48)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:634)
        at java.lang.Thread.run(Thread.java:745)

============================================================================================


What could be the reason, and how can we avoid such failures in the future?

How can we fix the "AlreadyClosedException" without restarting the service?


Thanks,
Doss.

Re: SOLR Cloud: Few cores goes to recovery mode all of a sudden

Erick Erickson
The “unable to create new thread” is where I’d focus first. It means you’re running out of some system resources and it’s quite possible that your other problems are arising from that root cause.

What are your “ulimit” settings? The number of file handles and processes should be set to at least 65k.
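
As a rough way to check those two limits, something like the Python sketch below could be run as the same user that runs Solr. It uses the standard `resource` module (Unix only). The `check_limits` helper and the concrete threshold of 65000 are illustrative assumptions, not anything from Solr itself, and the script reports the limits of the shell it is launched from, which may differ from those of an already-running Solr process.

```python
import resource

# Threshold taken from the "65k at least" advice above; 65000 is an
# assumed concrete value for "65k".
MIN_LIMIT = 65_000


def check_limits(minimum=MIN_LIMIT):
    """Return {description: (soft_limit, ok)} for open files and processes."""
    checks = {
        "open files (ulimit -n)": resource.RLIMIT_NOFILE,
        "max user processes (ulimit -u)": resource.RLIMIT_NPROC,
    }
    report = {}
    for name, res in checks.items():
        soft, _hard = resource.getrlimit(res)
        # RLIM_INFINITY means "unlimited", which always satisfies the minimum.
        ok = soft == resource.RLIM_INFINITY or soft >= minimum
        report[name] = (soft, ok)
    return report


if __name__ == "__main__":
    for name, (soft, ok) in check_limits().items():
        shown = "unlimited" if soft == resource.RLIM_INFINITY else soft
        print(f"{name}: {shown} -> {'OK' if ok else 'below 65k'}")
```

On Linux, the limits of a process that is already running can instead be read from `/proc/<pid>/limits`.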

Best,
Erick

> On Sep 3, 2019, at 9:38 AM, Doss <[hidden email]> wrote:


Re: SOLR Cloud: Few cores goes to recovery mode all of a sudden

Doss
Thanks Erick,

The ulimit values on all three nodes are above 65K, including the max process
limit. If you look at the timestamps, the core-down error happened ahead of
the unable-to-create-thread error; also, the core-down error took place on
node1, while the unable-to-create-thread error took place on node3.
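
For reference, the ordering of the two log entries can be checked directly from the timestamps quoted in the first message. This is only a sketch: it assumes the clocks on the two nodes are synchronized, which nothing in this thread confirms, and the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the two log entries quoted in the first message.
core_down = datetime.strptime("2019-09-03 08:42:41.650", FMT)   # "but it is down" ERROR
thread_oom = datetime.strptime("2019-09-03 08:42:36.460", FMT)  # QueuedThreadPool WARN

# Positive delta means the thread warning was logged before the core-down error.
delta = core_down - thread_oom
first = "thread warning" if thread_oom < core_down else "core-down error"
print(f"{first} logged first, by {abs(delta.total_seconds()):.3f} s")
```

With the quoted values, the QueuedThreadPool warning (08:42:36.460) is the earlier of the two entries, by about 5.2 seconds.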

BTW, we are running the Solr instances in VMs. We retain only the last three
days of ZooKeeper logs.

Thanks,
Doss.

On Tue, Sep 3, 2019 at 8:21 PM Erick Erickson <[hidden email]>
wrote:

> >
> >        at
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> >
> >        at org.eclipse.jetty.server.Server.handle(Server.java:534)
> >        at
> > org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
> >        at
> >
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
> >        at
> > org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> >
> >        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> >        at
> > org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> >
> >        at
> >
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> >
> >        at
> >
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> >
> >        at
> >
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> >
> >        at
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> >
> >        at
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> >
> >        at java.lang.Thread.run(Thread.java:748)
> > Caused by: org.apache.lucene.store.AlreadyClosedException: Already closed
> >        at
> >
> org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:337)
> >
> >        at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:349)
> >        at
> > org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1948)
> >
> >
> ============================================================================================
> >
> >
> > 3. Unable to create new native thread
> >
> > 2019-09-03 08:42:36.460 WARN  (qtp1348949648-20980) [c:viewsindex s:shard1 r:core_node5 x:viewsindex] o.e.j.u.t.QueuedThreadPool
> > java.lang.OutOfMemoryError: unable to create new native thread
> >        at java.lang.Thread.start0(Native Method)
> >        at java.lang.Thread.start(Thread.java:714)
> >        at
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.startThreads(QueuedThreadPool.java:475)
> >
> >        at
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.access$200(QueuedThreadPool.java:48)
> >
> >        at
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:634)
> >
> >        at java.lang.Thread.run(Thread.java:745)
> >
> >
> ============================================================================================
> >
> >
> > What could be the reason, and how can we avoid such failures in the future?
> >
> > How can we fix the "AlreadyClosedException" without restarting the service?
> >
> >
> > Thanks,
> > Doss.
>
>