Solr Hanging


"Trym R. Møller"
Hi

I am using Solr trunk and have 7 Solr instances running with 28 leaders
and 28 replicas for a single collection.
After indexing for a while (a couple of days), the Solr instances start
hanging. A thread dump of the JVM shows blocked threads like the following:
     Thread 2369: (state = BLOCKED)
      - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
      - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=158 (Compiled frame)
      - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=1987 (Compiled frame)
      - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=399 (Compiled frame)
      - java.util.concurrent.ExecutorCompletionService.take() @bci=4, line=164 (Compiled frame)
      - org.apache.solr.update.SolrCmdDistributor.checkResponses(boolean) @bci=27, line=350 (Compiled frame)
      - org.apache.solr.update.SolrCmdDistributor.finish() @bci=18, line=98 (Compiled frame)
      - org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish() @bci=4, line=299 (Compiled frame)
      - org.apache.solr.update.processor.DistributedUpdateProcessor.finish() @bci=1, line=817 (Compiled frame)
     ...
      - org.mortbay.thread.QueuedThreadPool$PoolThread.run() @bci=25, line=582 (Interpreted frame)

I read the stack trace as follows: my indexing client has sent a document,
and this Solr instance is now waiting for the replica (?) to respond before
returning an answer to the client.
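
The blocking call at the top of the trace can be illustrated in isolation.
The sketch below is not Solr's code; CompletionWaitDemo and awaitResponse
are hypothetical names. It submits one task to a
java.util.concurrent.ExecutorCompletionService and waits for the result.
With take() the caller parks until a result arrives, which matches the
frames in the dump; poll(timeout, unit) returns null once the timeout
expires instead.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class CompletionWaitDemo {

    /** Runs the call on a worker thread; returns its result, or null if none arrives in time. */
    static String awaitResponse(Callable<String> replicaCall, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            CompletionService<String> responses = new ExecutorCompletionService<>(executor);
            responses.submit(replicaCall);
            // responses.take() would park here with no upper bound.
            // poll() gives up after the timeout and returns null.
            Future<String> done = responses.poll(timeoutMs, TimeUnit.MILLISECONDS);
            return done == null ? null : done.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException e) {
            return null; // the call itself failed
        } finally {
            executor.shutdownNow(); // interrupts a call that is still running
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitResponse(() -> "ok", 2000));
        System.out.println(awaitResponse(() -> { Thread.sleep(5000); return "late"; }, 100));
    }
}
```

Whether a bounded wait is appropriate for replica updates is a separate
design question; the sketch only shows why a thread can sit in take()
for as long as no response arrives.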

The other Solrs have similar blocked threads.

Any ideas on how I can get closer to the problem? Am I reading the stack
trace correctly? Is there any further information that would be relevant
for commenting on this problem?

Thanks for any comments.

Best regards Trym

Re: Solr Hanging

Yonik Seeley-2-2
On Thu, Apr 19, 2012 at 4:25 AM, "Trym R. Møller" <[hidden email]> wrote:

> I read the stack trace as follows: my indexing client has sent a document,
> and this Solr instance is now waiting for the replica (?) to respond before
> returning an answer to the client.

Correct.  What's the full stack trace like on both a leader and replica?
We need to know what the replica is blocking on.

What version of trunk are you using?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10

Re: Solr Hanging

"Trym R. Møller"
Thanks for your answer.

I am running an (older) revision of Solr trunk from around 29 February 2012.

I suspect that the thread I included belongs to the leader of the shard.
The Solr instance that I took the dump from contains more than one leader,
so I don't know which shard (slice) the thread is working on. How can I
find the Solr instance containing the replica (I guess ZooKeeper can't
help me)?
And once I have found the Solr instance containing the replica, how do I
know which thread is handling the update request (each of my Solr
instances contains 8 cores)?
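
One way to narrow a dump down to the threads that are inside the update
code is to filter on the class names that appear in the frames. The sketch
below does this for the JVM it runs in, using Thread.getAllStackTraces().
StackFilter and threadsIn are hypothetical names, and the fragment
"org.apache.solr.update" is simply taken from the trace earlier in this
thread. For a separate Solr process, the same filter would have to be
applied to the text of an external thread dump instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class StackFilter {

    /** Names of live threads with at least one frame whose class name contains the fragment. */
    static List<String> threadsIn(String classNameFragment) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().contains(classNameFragment)) {
                    names.add(entry.getKey().getName());
                    break; // one matching frame is enough for this thread
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // In a JVM without Solr on the stack this prints an empty list.
        System.out.println(threadsIn("org.apache.solr.update"));
    }
}
```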

If this is not possible, I might be able to restart with a setup where
each Solr instance contains only a single core (a leader or a replica).

Best regards Trym

On 19-04-2012 14:36, Yonik Seeley wrote:

> Correct.  What's the full stack trace like on both a leader and replica?
> We need to know what the replica is blocking on.
>
> What version of trunk are you using?

Re: Solr Hanging

"Trym R. Møller"
In reply to this post by Yonik Seeley-2-2
Hi

I have succeeded in reproducing the scenario with two Solr instances
running. They cover a single collection with two slices and two replicas,
two cores in each Solr instance. I have changed the number of threads
that Jetty is allowed to use as follows:
<New class="org.mortbay.thread.QueuedThreadPool">
<Set name="minThreads">3</Set>
<Set name="maxThreads">3</Set>
<Set name="lowThreads">0</Set>
</New>
Indexing a single document works fine, but when concurrently indexing
10 documents, Solr frequently hangs.
I know that Jetty by default is allowed to use 10,000 threads, but in
my other setup all 10,000 allowed threads are in use on a single
Solr instance (I have 7 Solr instances) after some days, and then the
hang occurs.
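
One possible reading of this reproduction, offered as an assumption rather
than a confirmed diagnosis, is thread-pool starvation: every pool thread is
held by a request that waits for a sub-request, and the sub-request needs a
thread from an equally bounded pool. The sketch below models that with a
single JDK fixed-size pool standing in for Jetty's pool. PoolStarvationDemo
and completes are hypothetical names, and no Solr or Jetty code is involved.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class PoolStarvationDemo {

    /** Returns true if all simulated requests finish within timeoutMs. */
    static boolean completes(int poolSize, int requests, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Released once as many requests are running as can run at the same time.
        CountDownLatch saturated = new CountDownLatch(Math.min(poolSize, requests));
        CountDownLatch finished = new CountDownLatch(requests);
        try {
            for (int i = 0; i < requests; i++) {
                pool.submit(() -> {
                    saturated.countDown();
                    try {
                        saturated.await();
                        // The "forward to replica" step needs another thread from the same pool.
                        Future<String> ack = pool.submit(() -> "ack");
                        ack.get();
                        finished.countDown();
                    } catch (InterruptedException | ExecutionException e) {
                        // Interrupted by shutdownNow() after a hang; nothing to clean up.
                    }
                });
            }
            return finished.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool=3, requests=1:  " + completes(3, 1, 2000));
        System.out.println("pool=3, requests=10: " + completes(3, 10, 500));
    }
}
```

With a pool of 3, a single request should complete, while 10 concurrent
requests should leave all three threads waiting on work that can never be
scheduled. Raising the pool size only moves the point at which this
happens, which would be consistent with a 10,000-thread pool filling up
after some days.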

I'm not sure whether just adjusting the allowed number of threads is the
best solution, and I would like some input on what to expect and whether
there are other things I can adjust.
As described before, my setup is 7 Solr instances handling a single
collection with 28 leaders and 28 replicas distributed evenly across the
instances (8 cores on each).

Thanks for any input.

Best regards Trym



Re: Solr Hanging

Mark Miller-3
Perhaps related is http://www.lucidimagination.com/search/document/6d0e168c82c86a38#45c945b2de6543f4

On Apr 23, 2012, at 5:37 AM, Trym R. Møller wrote:

> Hi
>
> I have succeeded in reproducing the scenario with two Solr instances running. They cover a single collection with two slices and two replicas, two cores in each Solr instance. I have changed the number of threads that Jetty is allowed to use as follows:
> <New class="org.mortbay.thread.QueuedThreadPool">
> <Set name="minThreads">3</Set>
> <Set name="maxThreads">3</Set>
> <Set name="lowThreads">0</Set>
> </New>
> Indexing a single document works fine, but when concurrently indexing 10 documents, Solr frequently hangs.
> I know that Jetty by default is allowed to use 10,000 threads, but in my other setup all 10,000 allowed threads are in use on a single Solr instance (I have 7 Solr instances) after some days, and then the hang occurs.

- Mark Miller
lucidimagination.com


Re: Solr Hanging

Mark Miller-3
And see https://issues.apache.org/jira/browse/SOLR-683 as it also may be related or have helpful info...

On Apr 23, 2012, at 8:17 AM, Mark Miller wrote:

> Perhaps related is http://www.lucidimagination.com/search/document/6d0e168c82c86a38#45c945b2de6543f4

- Mark Miller
lucidimagination.com