Re: Solr reload process flow

Vadim Ivanov
Hi!
(Solr 7.6, TLOG replicas)
I have an issue while reloading a collection with 100 shards and 3 replicas per shard residing on 5 nodes.
The configuration of that collection is pretty complex (90 external file fields).
When a node starts, the cores always load successfully.

When I reload the collection with the Collections API command /admin/collections?action=RELOAD&name=col
all 5 nodes stop responding and I have a dead cluster. Only restarting Solr on all nodes revives it.

When I decreased the number of shards/cores by a factor of 5 (to 20 shards instead of 100), the collection reloaded successfully.
My guess is that during a collection RELOAD the limit on threads is not honored and all cores try to reload simultaneously.

Erick wrote here ( http://lucene.472066.n3.nabble.com/collection-reload-leads-to-OutOfMemoryError-td4380754.html#a4380791 )
➢ There are a limited number of threads that load in parallel when
➢ starting up, depends on the configuration. The defaults are 3 threads
➢ in stand-alone and 8 in Cloud (see: NodeConfig.java)

➢ public static final int DEFAULT_CORE_LOAD_THREADS = 3;
➢ public static final int DEFAULT_CORE_LOAD_THREADS_IN_CLOUD = 8;

But unfortunately, stumbling around in the source, I can't find the place that would confirm
whether this thread limit plays any role in a collection reload or not... though I lack the necessary skills in Java.
Maybe somebody can give a hint where to look?
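
To make the "limit on threads" idea concrete, here is a minimal Java sketch of how a fixed-size thread pool bounds the number of tasks running in parallel. It is not Solr's code: the class BoundedLoadSketch, the method peakConcurrency and the simulated 5 ms "core load" are all made up for illustration. Whether RELOAD goes through such a bounded pool in Solr 7.6 is exactly the open question above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in Solr.
public class BoundedLoadSketch {

    /**
     * Submits `tasks` simulated core loads to a pool of `threads` workers
     * and returns the highest number of loads observed running at once.
     */
    public static int peakConcurrency(int tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for the real work of loading a core
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 300 cores on 5 nodes is 60 cores per node; 8 is the cloud default quoted above.
        System.out.println("peak parallel loads: " + peakConcurrency(60, 8));
    }
}
```

With a bounded pool the peak can never exceed the pool size, no matter how many cores are queued. If a reload does not go through a pool like this, all 60 cores on a node could start reloading at once.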

There was a discussion here as well:
http://lucene.472066.n3.nabble.com/Solr-reload-process-flow-td4379966.html#none
--
Vadim


Erick Erickson
You can set it in solr.xml, see:
http://lucene.apache.org/solr/guide/7_6/format-of-solr-xml.html
if you'd like to experiment with bumping it higher.

As for "stumbling around in the code": what this _sounds_ like is some
kind of deadlock, and bumping the number of threads would be, at best,
a band-aid....

There are two things I'd try:

1> Assuming you're on a *nix system, make sure you've bumped your
ulimit for processes and files. There's a warning when Solr
starts up about the limits for both processes and open files; both
should be quite high (65K).

2> Take a thread dump of Solr when it's stuck; here's a pretty good
overview: https://dzone.com/articles/how-analyze-java-thread-dumps
What you're looking for in particular is DEADLOCK status codes.
That'll give you a stack trace of exactly which threads
are deadlocked and indicate where to start looking. If you do
find deadlocked threads _and_ you have bumped your ulimits, it's
probably worth a JIRA...
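
For a hung Solr node, a thread dump taken from outside the process (for example with the JDK's jstack tool) is the practical route. Purely as an illustration of what the deadlock check looks like, the standard java.lang.management API exposes the same detection from inside a JVM. The sketch below is an assumption-laden example, not part of Solr: the class name DeadlockProbe is made up, and it only inspects the JVM it runs in, so it is not a drop-in diagnostic for a running Solr instance.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; inspects only the JVM that runs this code.
public class DeadlockProbe {

    /** Returns the number of threads currently deadlocked in this JVM (0 if none). */
    public static int countDeadlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    /** Returns a textual report of deadlocked threads, or an empty string if there are none. */
    public static String describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads();
        if (ids == null) {
            return "";
        }
        StringBuilder report = new StringBuilder();
        // Ask for locked monitors and synchronizers so the report shows who holds what.
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have exited in the meantime
                report.append(info.toString());
            }
        }
        return report.toString();
    }

    public static void main(String[] args) {
        int n = countDeadlockedThreads();
        System.out.println("deadlocked threads: " + n);
        System.out.print(describeDeadlocks());
    }
}
```

If the count is non-zero, the report names each deadlocked thread, the lock it is waiting on and the thread that owns that lock, which is the same information to look for in a jstack dump.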

And, of course, take a look at your Solr logs to see if there are any
clues there.

Best,
Erick

On Thu, Dec 27, 2018 at 1:51 AM Vadim Ivanov
<[hidden email]> wrote:
