Shard and replica went down in Solr 6.1.0

Shard and replica went down in Solr 6.1.0

vishal patel-2
I have 2 shards and 2 replicas in Solr 6.1.0. One shard and one replica went down, and I got the ERROR below:

2019-04-08 12:54:01.469 INFO  (commitScheduler-131-thread-1) [c:products s:shard1 r:core_node1 x:product1] o.a.s.s.SolrIndexSearcher Opening [Searcher@24b9127f[product1] main]
2019-04-08 12:54:01.468 INFO  (commitScheduler-110-thread-1) [c:product2 s:shard1 r:core_node1 x:product2] o.a.s.c.SolrDeletionPolicy SolrDeletionPolicy.onCommit: commits: num=2
commit{dir=G:\SolrCloud\solr1\server\solr\product2\data\index.20180412060518798,segFN=segments_he5,generation=22541}
commit{dir=G:\SolrCloud\solr1\server\solr\product2\data\index.20180412060518798,segFN=segments_he6,generation=22542}
2019-04-08 12:54:01.556 INFO  (commitScheduler-110-thread-1) [c:product2 s:shard1 r:core_node1 x:product2] o.a.s.c.SolrDeletionPolicy newest commit generation = 22542
2019-04-08 12:54:01.465 WARN (commitScheduler-136-thread-1) [c:product3 s:shard1 r:core_node1 x:product3] o.a.s.c.SolrCore [product3] PERFORMANCE WARNING: Overlapping onDeckSearchers=2

2019-04-08 12:54:01.534 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product3 x:product3 r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product3) [c:product3 s:shard1 r:core_node1 x:product3] o.a.s.u.StreamingSolrClients error
org.apache.solr.common.SolrException: Service Unavailable

request: http://10.101.111.80:8983/solr/product3/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct3%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Note: product1, product2, and product3 are my collections.

In my solrconfig.xml:
<autoCommit>
    <maxTime>600000</maxTime>
    <maxDocs>20000</maxDocs>
    <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
    <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>
<maxWarmingSearchers>2</maxWarmingSearchers>

Many documents were being committed at that time, and I found many commitScheduler threads in the log.
Is it possible that Solr went down because of the warning "PERFORMANCE WARNING: Overlapping onDeckSearchers=2"?
Do I need to update my autoCommit or maxWarmingSearchers?

Sent from Outlook<http://aka.ms/weboutlook>

Re: Shard and replica went down in Solr 6.1.0

Erick Erickson
Solr will not go down due to a “performance warning”. All that means is that you’re opening too many searchers in a short time. If you’re somehow opening a huge number of searchers, then maybe, but IIRC you can’t really open more than two anyway. And don’t think you can “fix” this by upping maxWarmingSearchers; that’ll only make the problem worse, as every new searcher chews up memory that’s kept until the old searcher is done with outstanding requests.

Which is weird because with those settings you shouldn’t be opening _any_ new searchers. So my guess is that some external client is doing that and this is usually an anti-pattern. Don’t do it, please. Just set your soft commit to what you need and leave it at that. As long as possible. Here are all the gory details:
https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

I’d also not bother with maxDocs, that’s much less predictable than just setting a reasonable time for autocommit, a minute or two is usually reasonable especially when openSearcher is false.
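As a rough sketch of what that could look like in solrconfig.xml (the 60-second value is purely an illustration, not a figure from this thread):

<autoCommit>
    <!-- hard commit roughly every minute; flushes index changes to disk
         without opening a new searcher -->
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
</autoCommit>

<!-- leave maxWarmingSearchers at 2, as in the original config, rather than raising it -->
<maxWarmingSearchers>2</maxWarmingSearchers>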

Not sure what’s really generating that error, take a look at all your other Solr logs to see if there’s a cause.

Best,
Erick




Re: Shard and replica went down in Solr 6.1.0

vishal patel-2

Thanks Erick,
I got your point. As you said, Solr will not go down due to a “performance warning”, and there is no need to change the maxDocs value. You talked about the number of searchers, but in solrconfig.xml it is only 2 (<maxWarmingSearchers>2</maxWarmingSearchers>).

In production, we have 27 collections, 2 shards, 2 replicas, and 3 ZooKeepers, and more than 30,000 documents are indexed within 10 minutes.
How can I know how many searchers are open at a specific time?
As per my understanding, a Solr searcher opens on a soft or hard commit. Am I right? My commit settings are below:
<autoCommit>
    <maxTime>600000</maxTime>
    <maxDocs>20000</maxDocs>
    <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
    <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>

I do not write any code that opens a Solr index searcher, and I cannot find any error related to that.

All my collections have the same configuration, meaning the same hard and soft commit times for all 27 collections. Is it an issue if two or more collections come up for a hard commit at the same time?

Below, I am again sending more accurate log details.

Shard-1 Log ::
--------------
2019-04-08 12:54:01.395 INFO  (commitScheduler-102-thread-1) [c:collection1 s:shard1 r:core_node1 x:collection1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2019-04-08 12:54:01.395 INFO  (commitScheduler-118-thread-1) [c:collection2 s:shard1 r:core_node1 x:collection2] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2019-04-08 12:54:01.395 INFO  (commitScheduler-110-thread-1) [c:collection3 s:shard1 r:core_node1 x:collection3] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2019-04-08 12:54:01.394 INFO  (commitScheduler-109-thread-1) [c:collection4 s:shard1 r:core_node1 x:collection4] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
2019-04-08 12:54:01.394 INFO  (commitScheduler-99-thread-1) [c:collection5 s:shard1 r:core_node1 x:collection5] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

2019-04-08 12:54:01.405 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product x:product r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product) [c:product s:shard1 r:core_node1 x:product] o.a.s.u.StreamingSolrClients error
org.apache.solr.common.SolrException: Service Unavailable



request: http://10.101.111.80:8983/solr/product/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


Shard-1-replica-1 Log ::
--------------
2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
2019-04-08 12:54:01.294 INFO  (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.LogUpdateProcessorFactory [product]  webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{} 0 0
2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it is coming from leader, but we are the leader
at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:621)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:392)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:320)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:679)
at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:97)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:274)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:239)
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:157)
at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:54)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:518)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
at java.lang.Thread.run(Thread.java:745)

After this, all collections went down one by one. We cannot identify why this ERROR came or what it means.

In production, my versions are:
ZooKeeper: 3.4.6
Java: 8
SolrJ: 5.2.0

Please guide me...


Re: Shard and replica went down in Solr 6.1.0

vishal patel-2
Actually, in our application bulky sets of documents need to be indexed, and at the same time we want to see those documents. So in production we keep the auto commit at 10 minutes and the auto soft commit at 1 second.
Is it OK?

Get Outlook for Android<https://aka.ms/ghei36>



________________________________
From: Erick Erickson <[hidden email]>
Sent: Thursday, April 11, 2019 10:23:00 PM
To: vishal patel
Subject: Re: Shard and replica went down in Solr 6.1.0

We’re not quite on the same page.

These config options will _not_ open any new searchers. Period. They’re not the source of your max warming searchers error. Therefore _somebody_ is issuing commits. You need to find out who and stop them.

Then change your settings in solrconfig to
1> remove maxDoc from autoCommit. It’s probably leading to useless work
2> I’d shorten my maxTime in autoCommit to, say, 1 minute. This isn’t very important, but 10 minutes is unnecessary
3> Change your autoSoftCommit to as long as your application can tolerate, say 10 minutes if possible.
4> Find out who is issuing commits and stop them.

With these settings, unless you have outrageous “autowarmCount” settings in solrconfig.xml for the caches, you should not see any overlapping on deck searchers. I usually start with autowarmCount settings in the 10-16 range.
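As an illustrative sketch of those changes in solrconfig.xml (example values only, not tuned for this particular setup):

<autoSoftCommit>
    <!-- as long an interval as the application can tolerate, e.g. 10 minutes -->
    <maxTime>600000</maxTime>
</autoSoftCommit>

<!-- modest autowarmCount values, in the 10-16 starting range mentioned above -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="16"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="16"/>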

Best,
Erick



Re: Shard and replica went down in Solr 6.1.0

Tulsi
Hi Vishal,
An autoSoftCommit of 1 second could be the reason for this issue.
Choose as long an autoSoftCommit interval as your application can tolerate; often 15-60 seconds is reasonable, or even longer depending on the requirements.
Also consider disabling your caches (queryResultCache and filterCache especially) unless they are required.
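For example, a minimal sketch along those lines (illustrative values; as far as I recall, a cache that is left unconfigured, or commented out, is simply not created):

<autoSoftCommit>
    <!-- e.g. 30 seconds instead of 1 second -->
    <maxTime>30000</maxTime>
</autoSoftCommit>

<!--
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
-->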


Re: Shard and replica went down in Solr 6.1.0

vishal patel-2
Thanks for your reply.

Actually, all caches are removed in my solrconfig.xml, so the autowarmCount does not matter for us.

And I read the link you gave about the hard commit and soft commit concepts.
My production scenario is index-heavy and query-heavy [Near Real Time].
So we set the hard commit max time to 10 minutes and the soft commit to -1 for each collection [27 collections in production].
Is it OK for production? Please give some input.

And how can we know from the log why a shard is going down?

And please tell me the meaning of the error below, because I cannot find this type of error on any web page.

Shard-1-replica
-----------------
2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
2019-04-08 12:54:01.294 INFO  (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.LogUpdateProcessorFactory [product]  webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{} 0 0
2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it is coming from leader, but we are the leader
at

Shard-1
------------------
2019-04-08 12:54:01.534 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product3 x:product3 r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product3) [c:product3 s:shard1 r:core_node1 x:product3] o.a.s.u.StreamingSolrClients error
org.apache.solr.common.SolrException: Service Unavailable


First we got the error in the replica, then in the shard.

Sent from Outlook<http://aka.ms/weboutlook>
________________________________
From: Erick Erickson <[hidden email]>
Sent: Friday, April 12, 2019 12:04 AM
To: vishal patel
Subject: Re: Shard and replica went down in Solr 6.1.0

Well, it leads to “too many on deck searchers”, obviously.

Here’s most of what you want to know about commits and the like: https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

I always push back on any setting that opens a new searcher every second. You might as well set all your autowarm counts in solrconfig.xml to zero; autowarming isn’t doing you any good. You’re also creating a lot of garbage churn, etc.

Set your soft commit to as long as you can. It helps if you can set expectations like “it may take up to a minute before your changes are visible”.
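One note on the config quoted earlier: with <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>, the soft-commit interval comes from a system property set at Solr startup, and the -1 default disables automatic soft commits entirely. So the 1-second behaviour presumably comes from something like -Dsolr.autoSoftCommit.maxTime=1000 being passed at startup; raising that property, or hard-coding a longer value as in the sketch below, lengthens the interval:

<autoSoftCommit>
    <!-- picks up -Dsolr.autoSoftCommit.maxTime at startup; falls back to 60000 ms (1 minute) here -->
    <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
</autoSoftCommit>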

> On Apr 11, 2019, at 10:09 AM, vishal patel <[hidden email]> wrote:
>
> Actually in our application , bulky documents are needed to index and on same time we want to see that documents. So in production we keep auto commit 10 minutes and auto soft commit 1 second.
> Is it ohk?
>
> Get Outlook for Android
>
>
>
> From: Erick Erickson <[hidden email]>
> Sent: Thursday, April 11, 2019 10:23:00 PM
> To: vishal patel
> Subject: Re: Shard and replica went down in Solr 6.1.0
>
> We’re not quite on the same page.
>
> These config options will _not_ open any new searchers. Period. They’re not the source of your max warming searchers error. Therefore _somebody_ is issuing commits. You need to find out who and stop them.
>
> Then change your settings in solrconfig to
> 1> remove maxDoc from autoCommit. It’s probably leading to useless work
> 2> I’d shorten my maxTime in autoCommit to, say, 1 minute. This isn’t very important, but 10 minutes is unnecessary
> 3> Change your autoSoftCommit to as long as your application can tolerate, say 10 minutes if possible.
> 4> Find out who is issuing commits and stop them.
>
> With these settings, unless you have outrageous “autowarmCount” settings in solrconfig.xml for the caches, you should not see any overlapping on deck searchers. I usually start with autowarmCount settings in the 10-16 range.
>
> Best,
> Erick
>
> > On Apr 11, 2019, at 5:40 AM, vishal patel <[hidden email]> wrote:
> >
> >
> > Thanks Erick,
> > I got your point. As per you, Solr will not go down due to “performance warning” and no need to change the maxdoc value. You talked about number of searchers but in solrconfig.xml there is only 2.[<maxWarmingSearchers>2</maxWarmingSearchers>]
> >
> > In production, we have 27 collection, 2 shards and 2 replicas and 3 zoo keepers and more than 30000 documents indexed within a 10 minutes.
> > How can I know how many searchers are open at the specific time?
> > As per my understanding, solr searcher will open when soft or hard commit. Am I right? And my commit time below
> > <autoCommit>
> > <maxTime>600000</maxTime>
> >     <maxDocs>20000</maxDocs>
> >     <openSearcher>false</openSearcher>
> > </autoCommit>
> >
> > <autoSoftCommit>
> > <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> > </autoSoftCommit>
> >
> > I do not write any code for opening the solr index searcher and can not find any error related that.
> >
> > All my collections have the same configuration, meaning the same hard and soft commit times for all 27 collections. Is it a problem if two or more collections reach their hard commit at the same time?
> >
> > Below are more accurate and detailed log extracts.
> >
> > Shard-1 Log ::
> > --------------
> > 2019-04-08 12:54:01.395 INFO  (commitScheduler-102-thread-1) [c:collection1 s:shard1 r:core_node1 x:collection1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2019-04-08 12:54:01.395 INFO  (commitScheduler-118-thread-1) [c:collection2 s:shard1 r:core_node1 x:collection2] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2019-04-08 12:54:01.395 INFO  (commitScheduler-110-thread-1) [c:collection3 s:shard1 r:core_node1 x:collection3] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2019-04-08 12:54:01.394 INFO  (commitScheduler-109-thread-1) [c:collection4 s:shard1 r:core_node1 x:collection4] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > 2019-04-08 12:54:01.394 INFO  (commitScheduler-99-thread-1) [c:collection5 s:shard1 r:core_node1 x:collection5] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> >
> > 2019-04-08 12:54:01.405 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product x:product r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product) [c:product s:shard1 r:core_node1 x:product] o.a.s.u.StreamingSolrClients error
> > org.apache.solr.common.SolrException: Service Unavailable
> >
> >
> >
> > request: http://10.101.111.80:8983/solr/product/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct%2F&wt=javabin&version=2
> > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
> > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
> > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
> > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown Source)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > at java.lang.Thread.run(Thread.java:745)
> >
> >
> > Shard-1-replica-1 Log ::
> > --------------
> > 2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
> > 2019-04-08 12:54:01.294 INFO  (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.LogUpdateProcessorFactory [product]  webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{} 0 0
> > 2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it is coming from leader, but we are the leader
> > at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:621)
> > at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:392)
> > at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:320)
> > at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:679)
> > at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> > at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:97)
> > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
> > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
> > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:274)
> > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
> > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:239)
> > at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:157)
> > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
> > at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
> > at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:54)
> > at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
> > at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
> > at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)
> > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)
> > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
> > at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
> > at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> > at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> > at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
> > at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
> > at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> > at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> > at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> > at org.eclipse.jetty.server.Server.handle(Server.java:518)
> > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
> > at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
> > at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> > at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
> > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
> > at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
> > at java.lang.Thread.run(Thread.java:745)
> >
> > After this, all the collections went down one by one. We cannot identify why this ERROR occurred or what it means.
> >
> > In production, our versions are:
> > zookeeper : 3.4.6
> > java : 8
> > solrj : 5.2.0
> >
> > Please guide me...
> > From: Erick Erickson <[hidden email]>
> > Sent: Wednesday, April 10, 2019 8:35 PM
> > To: [hidden email]
> > Subject: Re: Shard and replica went down in Solr 6.1.0
> >
> > Solr will not go down due to “performance warning”. All that means is that you’re opening too many searchers in a short time. If you’re somehow opening a huge number of searchers, then maybe but IIRC you can’t really open more than two. And don’t think you can “fix” this by upping maxWarmingSearchers, that’ll only make the problem worse as every new searcher chews up memory that’s kept until the old searcher is done with outstanding requests.
> >
> > Which is weird because with those settings you shouldn’t be opening _any_ new searchers. So my guess is that some external client is doing that and this is usually an anti-pattern. Don’t do it, please. Just set your soft commit to what you need and leave it at that. As long as possible. Here are all the gory details:
> > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >
> > I’d also not bother with maxDocs, that’s much less predictable than just setting a reasonable time for autocommit, a minute or two is usually reasonable especially when openSearcher is false.
> >
> > Not sure what’s really generating that error, take a look at all your other Solr logs to see if there’s a cause.
> >
> > Best,
> > Erick
> >
> >
> > > On Apr 10, 2019, at 5:21 AM, vishal patel <[hidden email]> wrote:
> > >
> > > I have 2 shards and 2 replicas of Solr 6.1.0. one shard and one replica went down and I got below ERROR
> > >
> > > 2019-04-08 12:54:01.469 INFO  (commitScheduler-131-thread-1) [c:products s:shard1 r:core_node1 x:product1] o.a.s.s.SolrIndexSearcher Opening [Searcher@24b9127f[product1] main]
> > > 2019-04-08 12:54:01.468 INFO  (commitScheduler-110-thread-1) [c:product2 s:shard1 r:core_node1 x:product2] o.a.s.c.SolrDeletionPolicy SolrDeletionPolicy.onCommit: commits: num=2
> > > commit{dir=G:\SolrCloud\solr1\server\solr\product2\data\index.20180412060518798,segFN=segments_he5,generation=22541}
> > > commit{dir=G:\SolrCloud\solr1\server\solr\product2\data\index.20180412060518798,segFN=segments_he6,generation=22542}
> > > 2019-04-08 12:54:01.556 INFO  (commitScheduler-110-thread-1) [c:product2 s:shard1 r:core_node1 x:product2] o.a.s.c.SolrDeletionPolicy newest commit generation = 22542
> > > 2019-04-08 12:54:01.465 WARN (commitScheduler-136-thread-1) [c:product3 s:shard1 r:core_node1 x:product3] o.a.s.c.SolrCore [product3] PERFORMANCE WARNING: Overlapping onDeckSearchers=2
> > >
> > > 2019-04-08 12:54:01.534 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product3 x:product3 r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product3) [c:product3 s:shard1 r:core_node1 x:product3] o.a.s.u.StreamingSolrClients error
> > > org.apache.solr.common.SolrException: Service Unavailable
> > >
> > > request: http://10.101.111.80:8983/solr/product3/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct3%2F&wt=javabin&version=2
> > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
> > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
> > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
> > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown Source)
> > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > > at java.lang.Thread.run(Thread.java:745)
> > >
> > > Note : product1,product2 and product3 are my collection.
> > >
> > > In my solrconfig.xml
> > > <autoCommit>
> > > <maxTime>600000</maxTime>
> > >    <maxDocs>20000</maxDocs>
> > >    <openSearcher>false</openSearcher>
> > > </autoCommit>
> > >
> > > <autoSoftCommit>
> > > <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> > > </autoSoftCommit>
> > > <maxWarmingSearchers>2</maxWarmingSearchers>
> > >
> > > There are many documents committed at that time and I found out so many commitScheduler threads in Log.
> > > Solr went down due to warn PERFORMANCE WARNING: Overlapping onDeckSearchers=2 is it possible?
> > > Need to update my autoCommit or  maxWarmingSearchers?
> > >
> > > Sent from Outlook<http://aka.ms/weboutlook>


Re: Shard and replica went down in Solr 6.1.0

vishal patel-2

Thanks for your reply.

2> In production, lots of documents come in for indexing every second. If I set the hard commit interval to 60 seconds, then searchers will open much more often when the hard commits execute. Is that OK for performance?
3> We cannot set the soft commit to 60 seconds at the moment, because our product is implemented as NRT: changes must be visible immediately after indexing.

Actually, I have analysed my production logs further:
The replica became the leader, and some time later documents still arrived at the new leader as if it were a replica (forwarded FROMLEADER by the old leader).

My production logs:

Shard log (10.102.119.85):

2019-04-08 12:54:01.405 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product x:product r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product) [c:product s:shard1 r:core_node1 x:product] o.a.s.u.StreamingSolrClients error
org.apache.solr.common.SolrException: Service Unavailable
request: http://10.101.111.80:8983/solr/product/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)


Replica log (10.101.111.80):

2019-04-08 12:52:19.717 INFO  (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr) [   ] o.a.s.c.c.ZkStateReader A live node change: [WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes], has occurred - updating... (live nodes size: [4])
2019-04-08 12:52:19.725 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext Running the leader process for shard=shard1 and weAreReplacement=true and leaderVoteWait=180000
2019-04-08 12:52:19.727 INFO  (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr) [   ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (4) -> (3)
2019-04-08 12:52:19.764 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext Checking if I should try and be the leader.
2019-04-08 12:52:19.765 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext My last published State was Active, it's okay to be the leader.
2019-04-08 12:52:19.766 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext I may be the new leader - try and sync
2019-04-08 12:52:19.767 WARN  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.RecoveryStrategy Stopping recovery for core=[product] coreNodeName=[core_node3]
2019-04-08 12:52:20.291 INFO  (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr) [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json], has occurred - updating... (live nodes size: [3])
2019-04-08 12:52:20.532 INFO  (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr) [   ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json], has occurred - updating... (live nodes size: [3])
2019-04-08 12:52:22.274 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy Sync replicas to http://10.101.111.80:8983/solr/product/
2019-04-08 12:52:22.274 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy Sync Success - now sync replicas to me
2019-04-08 12:52:22.274 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy http://10.101.111.80:8983/solr/product/ has no replicas
2019-04-08 12:52:22.275 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContextBase Creating leader registration node /collections/product/leaders/shard1/leader after winning as /collections/product/leader_elect/shard1/election/245953614547976270-core_node3-n_0000000001
2019-04-08 12:52:22.384 INFO  (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext I am the new leader: http://10.101.111.80:8983/solr/product/ shard1

2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
2019-04-08 12:54:01.294 INFO  (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.LogUpdateProcessorFactory [product]  webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{} 0 0
2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it is coming from leader, but we are the leader
at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:621)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:392)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:320)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:679)
at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)


First, for what reason did the replica become the leader? And if the replica became the leader, why did documents still arrive at the new leader two minutes later as if it were the replica? We index documents through ZooKeeper rather than sending them directly to a shard.
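For reference, below is a minimal sketch of what indexing "through ZooKeeper" typically looks like with SolrJ. This assumes a ZooKeeper-aware CloudSolrClient (the 5.x-style constructor, matching the solrj 5.2.0 client mentioned above); the zkHost string, collection name and document values are placeholders, not the real setup. Note that the FROMLEADER request in the log above is sent between Solr nodes (the old leader to the node it still thinks is a replica), not by the indexing client.

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class ZkIndexingSketch {
    public static void main(String[] args) throws Exception {
        // CloudSolrClient watches cluster state in ZooKeeper and sends each
        // update to the shard leader it currently sees there.
        try (CloudSolrClient client =
                 new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181")) {
            client.setDefaultCollection("product");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "example-1");   // hypothetical document
            doc.addField("name", "example");

            // No explicit commit: visibility is left to autoCommit /
            // autoSoftCommit in solrconfig.xml.
            client.add(doc);
        }
    }
}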

Sent from Outlook<http://aka.ms/weboutlook>
________________________________
From: Erick Erickson <[hidden email]>
Sent: Friday, April 12, 2019 8:52 PM
To: vishal patel
Subject: Re: Shard and replica went down in Solr 6.1.0

What was unclear about the statement “These config options will _not_ open any new searchers. Period.”?

You _cannot_ be opening new searchers automatically, some client somewhere _must_ be issuing a commit. And you shouldn’t do that. Until you find the external source that’s issuing the commit, you’ll make no progress on this.

Is this an explicit enough recommendation?

1> find the source of the commits and stop doing that.
2> change your hard commit interval to 60 seconds. Keep openSearcher set to false.
3> change your soft commit to 60 seconds (this will imply that you can’t search docs for a minute).

This may or may not be the root of the “Request says it is coming from the leader” message. But it’s certainly a contributing factor. We can identify this positively as a bad configuration so let’s eliminate it and _then_ look for more issues.

Best,
Erick
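
To make recommendation 1> concrete, here is a hedged sketch of the kind of client code to hunt for. The core URL and document are hypothetical and it assumes SolrJ's HttpSolrClient; the commitWithin call at the end is one common replacement for the explicit commit.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CommitPatternSketch {
    public static void main(String[] args) throws Exception {
        try (SolrClient client =
                 new HttpSolrClient("http://localhost:8983/solr/product")) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "example-1");   // hypothetical document

            // Anti-pattern: an explicit commit after every add opens a new
            // searcher each time and is what drives the
            // "Overlapping onDeckSearchers" warning under heavy indexing.
            client.add(doc);
            client.commit();   // <-- the call to find and remove

            // Alternative: bound visibility with commitWithin (60 seconds
            // here) and let the server schedule the actual commit.
            client.add(doc, 60000);
        }
    }
}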



Re: Shard and replica went down in Solr 6.1.0

Shawn Heisey-2
On 4/13/2019 9:29 PM, vishal patel wrote:
> 2> In production, lots of documents come in for indexing every second. If I set the hard commit interval to 60 seconds, then searchers will open much more often when the hard commits execute. Is that OK for performance?

The autoCommit configuration should have openSearcher set to false.
That way there will be no searchers opening on the hard commit.  It is
opening the searcher that is expensive, so doing hard commits without
opening a new searcher is normally VERY fast and uses very few
resources.  When openSearcher is false, that commit will NOT make index
changes visible.
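
As a concrete illustration of that point — a hedged sketch with a hypothetical core URL and document id, assuming SolrJ — a document added without a searcher-opening commit is not searchable until such a commit happens, for example a soft commit:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class VisibilitySketch {
    public static void main(String[] args) throws Exception {
        try (SolrClient client =
                 new HttpSolrClient("http://localhost:8983/solr/product")) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "visibility-test-1");
            client.add(doc);

            // No searcher-opening commit yet, so the document is not
            // searchable (assuming no autoSoftCommit fired in between).
            long before = client.query(new SolrQuery("id:visibility-test-1"))
                                .getResults().getNumFound();   // expected 0

            // Soft commit (waitFlush, waitSearcher, softCommit): opening the
            // new searcher is the expensive part.
            client.commit(true, true, true);

            long after = client.query(new SolrQuery("id:visibility-test-1"))
                               .getResults().getNumFound();    // expected 1
            System.out.println(before + " -> " + after);
        }
    }
}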

> 3> We cannot set the soft commit to 60 seconds at the moment, because our product is implemented as NRT: changes must be visible immediately after indexing.

As I just said, it is opening the searcher that is expensive.  If that
happens extremely frequently, which it would have to in order to achieve
instant NRT, that's going to be opening a LOT of new searchers.  This
will absolutely kill performance.  If you remove all cache warming, it
will be better, but probably still bad.

Unless the index is extremely small, it is not usually possible to
achieve a goal of "documents must be visible within one second after
indexing".

If you are getting the "Overlapping onDeckSearchers" warning, it means
you have commits that open a new searcher happening too quickly.  Let's
say that it takes five seconds to finish a commit that opens a new
searcher.  I do not know how long it takes on your index, but I've seen
it take much longer, so that's the example I'm going with.  If you have
such commits happening once a second, the warning will be logged because
you will end up with five new searchers opening at the same time.
Running them all simultaneously might make them take even longer -- the
problem compounds itself.
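
The same arithmetic as a tiny sketch (the five-second warm time is an assumed figure, not a measurement):

public class OverlapEstimate {
    public static void main(String[] args) {
        double warmTimeMs = 5000.0;        // assumed time to open and warm a searcher
        double commitIntervalMs = 1000.0;  // searcher-opening commits once a second
        // Roughly this many searchers end up warming at the same time.
        long overlapping = (long) Math.ceil(warmTimeMs / commitIntervalMs);
        System.out.println("concurrent warming searchers ~ " + overlapping);  // 5
    }
}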

I don't have any idea why you're having replicas go down, other than
maybe this:  All the commit activity (opening new searchers) might be
keeping the system so busy that queries are taking long enough to reach
timeouts.

Thanks,
Shawn

Re: Shard and replica went down in Solr 6.1.0

vishal patel-2
Thanks for your reply.


Get Outlook for Android<https://aka.ms/ghei36>
