Solr 6.6 is trying to loading *some* (not all) cores more than once

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Solr 6.6 is trying to loading *some* (not all) cores more than once

Shawn Heisey-2
I have a situation at work where a dev system that I have just upgraded
from 6.3.0 to 6.6.0 is trying to load a small number of cores more than
once.  It's not always the same cores on a restart -- sometimes the list
is different, and isn't always the same number of cores.

The affected cores are working perfectly -- the first load appears to
have no issues and is not affected by the second (failed) load attempt.

Grepping for "inc_1" in the solr log (the final directory in the
instanceDir of a core that was affected on that restart), I see that the
core "inclive" is discovered and loaded, then a short time later, Solr
attempts to load the same core again.

2017-07-13 15:54:10.168 INFO  (coreLoadExecutor-6-thread-1) [   ]
o.a.s.c.CoreContainer Creating SolrCore 'inclive' using configuration
from instancedir /index/solr6/data/cores/inc_1, trusted=true
2017-07-13 15:54:10.168 INFO  (coreLoadExecutor-6-thread-1) [   ]
o.a.s.c.SolrCore [[inclive] ] Opening new SolrCore at
[/index/solr6/data/cores/inc_1],
dataDir=[/index/solr6/data/cores/inc_1/../../data/inc_1/]
2017-07-13 15:54:10.362 INFO  (coreLoadExecutor-6-thread-1) [   ]
o.a.s.r.ManagedResourceStorage File-based storage initialized to use
dir: /index/solr6/data/cores/inc_1/conf
2017-07-13 15:54:10.363 INFO  (coreLoadExecutor-6-thread-1) [   ]
o.a.s.r.ManagedResourceStorage Loaded null at path _rest_managed.json
using file:dir=/index/solr6/data/cores/inc_1/conf
2017-07-13 15:54:10.640 INFO  (qtp120960120-22) [   x:inclive]
o.a.s.c.CoreContainer Creating SolrCore 'inclive' using configuration
from instancedir /index/solr6/data/cores/inc_1, trusted=true
2017-07-13 15:54:10.641 INFO  (qtp120960120-22) [   x:inclive]
o.a.s.c.SolrCore [[inclive] ] Opening new SolrCore at
[/index/solr6/data/cores/inc_1],
dataDir=[/index/solr6/data/cores/inc_1/../../data/inc_1/]
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held
by this virtual machine: /index/solr6/data/data/inc_1/index/write.lock

I have only defined the solr home, not the coreRootDirectory.  Here is
the first part of the log where it shows the primary configuration
information:

2017-07-13 15:54:06.231 INFO  (main) [   ] o.e.j.s.Server
jetty-9.3.14.v20161028
2017-07-13 15:54:06.573 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter
___      _       Welcome to Apache Solr™ version 6.6.0
2017-07-13 15:54:06.574 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter /
__| ___| |_ _   Starting in standalone mode on port 8982
2017-07-13 15:54:06.574 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter
\__ \/ _ \ | '_|  Install dir: /opt/solr6
2017-07-13 15:54:06.590 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter
|___/\___/_|_|    Start time: 2017-07-13T15:54:06.575Z
2017-07-13 15:54:06.608 INFO  (main) [   ] o.a.s.c.SolrResourceLoader
Using system property solr.solr.home: /index/solr6/data
2017-07-13 15:54:06.616 INFO  (main) [   ] o.a.s.c.SolrXmlConfig Loading
container configuration from /index/solr6/data/solr.xml
2017-07-13 15:54:06.745 INFO  (main) [   ] o.a.s.c.SolrResourceLoader
[null] Added 8 libs to classloader, from paths: [/index/solr6/data/lib]
2017-07-13 15:54:07.244 INFO  (main) [   ] o.a.s.u.UpdateShardHandler
Creating UpdateShardHandler HTTP client with params:
socketTimeout=600000&connTimeout=60000&retry=true
2017-07-13 15:54:07.446 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator
Found 49 core definitions underneath /index/solr6/data
2017-07-13 15:54:07.447 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator
Cores are: [s0build, ai-inclive, spark1live, inclive, spark5build,
ai-main, ncmain, s2build, sparkmain, ncrss, s4build, ai-1build, s3live,
spark7build, spark7live, sparkinclive, sparkrss, spark2build,
spark6build, spark1build, s5live, spark4build, ai-0live, s3build,
spark2live, ai-incbuild, spark9live, spark4live, spark8live, ai-rss,
s1build, spark0build, incbuild, spark5live, ai-1live, spark0live,
s5build, s2live, sparkincbuild, s0live, spark3live, s4live, spark8build,
spark3build, ai-0build, banana, s1live, spark6live, spark9build]
2017-07-13 15:54:07.549 INFO  (main) [   ] o.e.j.s.Server Started @2675ms

As I said, this does not affect how the system runs, but it does display
unsightly "Initialization Failures" on every page of the admin UI.

Has anyone else run into this?  Worth an issue?

Possible enhancement idea: A button to acknowledge initialization
failures and remove them from the UI display.

Thanks,
Shawn

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Solr 6.6 is trying to loading *some* (not all) cores more than once

Erick Erickson
I haven't seen anything like that, unsightly indeed.

I like the idea of the button to remove failed messages. The only
thing you see is the write lock exception, correct?

And since you say it fails on different cores at different times that
seems to rule out somehow you have more than one core pointing to the
same data dir.

Two questions:
1> I don't see "inc_1" in the list of cores discovered by the
corePropertiesLocator. Is this perhaps an alias?
2> Are you absolutely sure that you don't somehow have more than one
core pointing to the same data dir? this latter is unlikely as you say
all the cores work, just covering bases.

Erick

On Fri, Jul 14, 2017 at 9:34 AM, Shawn Heisey <[hidden email]> wrote:

> I have a situation at work where a dev system that I have just upgraded
> from 6.3.0 to 6.6.0 is trying to load a small number of cores more than
> once.  It's not always the same cores on a restart -- sometimes the list
> is different, and isn't always the same number of cores.
>
> The affected cores are working perfectly -- the first load appears to
> have no issues and is not affected by the second (failed) load attempt.
>
> Grepping for "inc_1" in the solr log (the final directory in the
> instanceDir of a core that was affected on that restart), I see that the
> core "inclive" is discovered and loaded, then a short time later, Solr
> attempts to load the same core again.
>
> 2017-07-13 15:54:10.168 INFO  (coreLoadExecutor-6-thread-1) [   ]
> o.a.s.c.CoreContainer Creating SolrCore 'inclive' using configuration
> from instancedir /index/solr6/data/cores/inc_1, trusted=true
> 2017-07-13 15:54:10.168 INFO  (coreLoadExecutor-6-thread-1) [   ]
> o.a.s.c.SolrCore [[inclive] ] Opening new SolrCore at
> [/index/solr6/data/cores/inc_1],
> dataDir=[/index/solr6/data/cores/inc_1/../../data/inc_1/]
> 2017-07-13 15:54:10.362 INFO  (coreLoadExecutor-6-thread-1) [   ]
> o.a.s.r.ManagedResourceStorage File-based storage initialized to use
> dir: /index/solr6/data/cores/inc_1/conf
> 2017-07-13 15:54:10.363 INFO  (coreLoadExecutor-6-thread-1) [   ]
> o.a.s.r.ManagedResourceStorage Loaded null at path _rest_managed.json
> using file:dir=/index/solr6/data/cores/inc_1/conf
> 2017-07-13 15:54:10.640 INFO  (qtp120960120-22) [   x:inclive]
> o.a.s.c.CoreContainer Creating SolrCore 'inclive' using configuration
> from instancedir /index/solr6/data/cores/inc_1, trusted=true
> 2017-07-13 15:54:10.641 INFO  (qtp120960120-22) [   x:inclive]
> o.a.s.c.SolrCore [[inclive] ] Opening new SolrCore at
> [/index/solr6/data/cores/inc_1],
> dataDir=[/index/solr6/data/cores/inc_1/../../data/inc_1/]
> Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held
> by this virtual machine: /index/solr6/data/data/inc_1/index/write.lock
>
> I have only defined the solr home, not the coreRootDirectory.  Here is
> the first part of the log where it shows the primary configuration
> information:
>
> 2017-07-13 15:54:06.231 INFO  (main) [   ] o.e.j.s.Server
> jetty-9.3.14.v20161028
> 2017-07-13 15:54:06.573 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter
> ___      _       Welcome to Apache Solr™ version 6.6.0
> 2017-07-13 15:54:06.574 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter /
> __| ___| |_ _   Starting in standalone mode on port 8982
> 2017-07-13 15:54:06.574 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter
> \__ \/ _ \ | '_|  Install dir: /opt/solr6
> 2017-07-13 15:54:06.590 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter
> |___/\___/_|_|    Start time: 2017-07-13T15:54:06.575Z
> 2017-07-13 15:54:06.608 INFO  (main) [   ] o.a.s.c.SolrResourceLoader
> Using system property solr.solr.home: /index/solr6/data
> 2017-07-13 15:54:06.616 INFO  (main) [   ] o.a.s.c.SolrXmlConfig Loading
> container configuration from /index/solr6/data/solr.xml
> 2017-07-13 15:54:06.745 INFO  (main) [   ] o.a.s.c.SolrResourceLoader
> [null] Added 8 libs to classloader, from paths: [/index/solr6/data/lib]
> 2017-07-13 15:54:07.244 INFO  (main) [   ] o.a.s.u.UpdateShardHandler
> Creating UpdateShardHandler HTTP client with params:
> socketTimeout=600000&connTimeout=60000&retry=true
> 2017-07-13 15:54:07.446 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator
> Found 49 core definitions underneath /index/solr6/data
> 2017-07-13 15:54:07.447 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator
> Cores are: [s0build, ai-inclive, spark1live, inclive, spark5build,
> ai-main, ncmain, s2build, sparkmain, ncrss, s4build, ai-1build, s3live,
> spark7build, spark7live, sparkinclive, sparkrss, spark2build,
> spark6build, spark1build, s5live, spark4build, ai-0live, s3build,
> spark2live, ai-incbuild, spark9live, spark4live, spark8live, ai-rss,
> s1build, spark0build, incbuild, spark5live, ai-1live, spark0live,
> s5build, s2live, sparkincbuild, s0live, spark3live, s4live, spark8build,
> spark3build, ai-0build, banana, s1live, spark6live, spark9build]
> 2017-07-13 15:54:07.549 INFO  (main) [   ] o.e.j.s.Server Started @2675ms
>
> As I said, this does not affect how the system runs, but it does display
> unsightly "Initialization Failures" on every page of the admin UI.
>
> Has anyone else run into this?  Worth an issue?
>
> Possible enhancement idea: A button to acknowledge initialization
> failures and remove them from the UI display.
>
> Thanks,
> Shawn
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Solr 6.6 is trying to loading *some* (not all) cores more than once

Shawn Heisey-2
On 7/14/2017 10:48 AM, Erick Erickson wrote:

> I haven't seen anything like that, unsightly indeed.
>
> I like the idea of the button to remove failed messages. The only
> thing you see is the write lock exception, correct?
>
> And since you say it fails on different cores at different times that
> seems to rule out somehow you have more than one core pointing to the
> same data dir.
>
> Two questions:
> 1> I don't see "inc_1" in the list of cores discovered by the
> corePropertiesLocator. Is this perhaps an alias?
> 2> Are you absolutely sure that you don't somehow have more than one
> core pointing to the same data dir? this latter is unlikely as you say
> all the cores work, just covering bases.

The core is named inclive, its instanceDir ends in inc_1.  At the
moment, the core named incbuild is pointed at inc_0.  When those cores
are swapped, that will be reversed.  I went with this directory naming
scheme because cores get swapped on every full index rebuild and I did
not want to have a situation where a live core was pointed at a
directory with "build" in the name.  It's not running in cloud mode, so
there are no aliases.

There were no problems with 6.3.0 pointing at the same solr home.
Seeing the "write.lock" problem after the upgrade, I initially assumed
that lockfiles were left over from 6.3.0 ... so I stopped Solr, deleted
all the lockfiles, and started it back up.  That's when I saw that it
was being held by the running VM, not the previous one.

Thanks,
Shawn

Loading...