Error adding replica after a delete replica

WebsterHomer
A colleague of mine was testing how SolrCloud replica recovery works. We
have had a lot of issues with replicas going into recovery mode, and with
replicas in down and recovery-failed states. So to test, he deleted a healthy
replica in one of our development environments. First the delete operation
timed out, but the replica appears to be gone. However, addReplica always
fails with this error:

Error CREATEing SolrCore 'sial-content-citations_shard1_replica1': Unable
to create core [sial-content-citations_shard1_replica1] Caused by: Lock
held by this virtual machine:
/var/solr/data/sial-content-citations_shard1_replica1/data/index/write.lock

This cloud has 4 nodes. The collection has two shards with two replicas per
shard. They are all hosted in a Google Cloud environment.

So if the delete removed the replica, why would the JVM still hold a lock on
its index? We want to understand this.

We are using Solr 6.2.0.

--


This message and any attachment are confidential and may be privileged or
otherwise protected from disclosure. If you are not the intended recipient,
you must not copy this message or attachment or disclose the contents to
any other person. If you have received this transmission in error, please
notify the sender immediately and delete the message and any attachment
from your system. Merck KGaA, Darmstadt, Germany and any of its
subsidiaries do not accept liability for any omissions or errors in this
message which may arise as a result of E-Mail-transmission or for damages
resulting from any unauthorized changes of the content of this message and
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its
subsidiaries do not guarantee that this message is free of viruses and does
not accept liability for any damages caused by any virus transmitted
therewith.

Click http://www.emdgroup.com/disclaimer to access the German, French,
Spanish and Portuguese versions of this disclaimer.
Re: Error adding replica after a delete replica

Emir Arnautović
Hi,
How did you delete the replica? Did you see any errors in the logs after deleting it? How did/does the cluster look from the ZooKeeper (ZK) perspective after deleting that replica?
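
One way to look at the cluster state that Solr keeps in ZooKeeper is the Collections API CLUSTERSTATUS action. The following is only a sketch: the base URL, the node names, and the sample response are made up for illustration, and the helper names are not part of Solr. It builds the request URL and lists which cores the cluster state still knows about, so you can see whether the deleted core is really gone.

```python
from urllib.parse import urlencode


def clusterstatus_url(base_url, collection):
    """Build a Collections API CLUSTERSTATUS URL for one collection."""
    params = {"action": "CLUSTERSTATUS", "collection": collection, "wt": "json"}
    return f"{base_url}/admin/collections?{urlencode(params)}"


def replicas_by_core(status, collection):
    """Map core name -> (shard, replica name, node, state).

    `status` is the parsed JSON body of a CLUSTERSTATUS response.
    """
    shards = status["cluster"]["collections"][collection]["shards"]
    found = {}
    for shard_name, shard in shards.items():
        for replica_name, replica in shard["replicas"].items():
            found[replica["core"]] = (
                shard_name,
                replica_name,
                replica["node_name"],
                replica["state"],
            )
    return found


# Hypothetical, hand-written response fragment (not real cluster output).
sample = {
    "cluster": {
        "collections": {
            "sial-content-citations": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node2": {
                                "core": "sial-content-citations_shard1_replica2",
                                "node_name": "host2:8983_solr",
                                "state": "active",
                            }
                        }
                    }
                }
            }
        }
    }
}

cores = replicas_by_core(sample, "sial-content-citations")
# True here would mean the cluster state no longer lists the deleted core.
deleted_core_gone = "sial-content-citations_shard1_replica1" not in cores
```

If the core is absent from the cluster state but the node still reports a held write.lock, the leftover is on the node itself rather than in ZooKeeper.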

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 5 Oct 2017, at 16:17, Webster Homer <[hidden email]> wrote:
Reply | Threaded
Open this post in threaded view
|

Re: Error adding replica after a delete replica

WebsterHomer
The replica was deleted using the deleteReplica Collections API call. The
call timed out but eventually completed. However, something still held the
write lock, and it was still held a day later, even though the replica had
been removed as far as we could tell from the Solr admin console.
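
For a call that times out on the client side, the Collections API also accepts an async request ID, and the outcome can then be polled with REQUESTSTATUS. This is a rough sketch, not what was run here: the base URL, the replica name "core_node1", and the request ID are hypothetical, and the helper names are invented. It is not claimed to avoid the stale lock.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, params):
    """Build a Collections API URL; `params` is a plain dict of query parameters."""
    query = {"action": action, "wt": "json"}
    query.update(params)
    return f"{base_url}/admin/collections?{urlencode(query)}"


def delete_replica_url(base_url, collection, shard, replica, request_id):
    """DELETEREPLICA submitted asynchronously under the given request ID."""
    return collections_api_url(
        base_url,
        "DELETEREPLICA",
        {
            "collection": collection,
            "shard": shard,
            "replica": replica,   # replica name from the cluster state (assumed here)
            "async": request_id,  # passed via dict because `async` is a Python keyword
        },
    )


def request_status_url(base_url, request_id):
    """REQUESTSTATUS URL for polling an async request."""
    return collections_api_url(base_url, "REQUESTSTATUS", {"requestid": request_id})


def is_finished(status_response):
    """True once the async request reports a terminal state."""
    return status_response["status"]["state"] in ("completed", "failed")


base = "http://localhost:8983/solr"  # hypothetical node address
delete_url = delete_replica_url(
    base, "sial-content-citations", "shard1", "core_node1", "delete-replica-1"
)
status_url = request_status_url(base, "delete-replica-1")
```

Polling the status URL until is_finished returns True at least tells you whether Solr considers the delete completed or failed before you try addReplica.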

Since it was a development collection, we "fixed" the problem by deleting
the collection and re-creating it.



On Fri, Oct 6, 2017 at 2:44 AM, Emir Arnautović <
[hidden email]> wrote:


Re: Error adding replica after a delete replica

Erick Erickson
For future reference, a less harsh way to fix it would be to:
> stop the Solr instance where the replica resides
> rm -rf SOLR_HOME/collection1_replica1_shard1

where "collection1_replica1_shard1" is the directory of the replica in
question. You should see a "core.properties" file in that directory.

That said, this shouldn't be necessary; it's just in case.
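
For anyone who wants the steps above with a safety check, here is a minimal sketch. It assumes Solr on that node is already stopped and that you pass in the real SOLR_HOME and core directory name; the function name and the dry-run default are my own additions for illustration. It refuses to touch a directory that has no core.properties file.

```python
import shutil
from pathlib import Path


def remove_core_dir(solr_home, core_name, dry_run=True):
    """Remove a leftover core directory under SOLR_HOME.

    Only run this with the Solr instance on the node stopped.
    With dry_run=True (the default) nothing is deleted.
    """
    core_dir = Path(solr_home) / core_name
    # A core directory is expected to contain core.properties.
    if not (core_dir / "core.properties").is_file():
        raise FileNotFoundError(f"{core_dir} does not look like a core directory")

    lock_file = core_dir / "data" / "index" / "write.lock"
    report = {
        "core_dir": str(core_dir),
        "write_lock_present": lock_file.exists(),
        "removed": False,
    }
    if not dry_run:
        shutil.rmtree(core_dir)  # equivalent of rm -rf on the core directory
        report["removed"] = True
    return report
```

Calling it first with the default dry_run=True shows which directory would go and whether a write.lock file is sitting in it; pass dry_run=False only once that looks right.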

Best,
Erick

On Fri, Oct 6, 2017 at 10:34 AM, Webster Homer <[hidden email]> wrote:


Re: Error adding replica after a delete replica

WebsterHomer
Unfortunately, as developers we have no access to the actual Solr nodes, and
certainly no privileges to delete anything, even in the development
environment.

On Fri, Oct 6, 2017 at 1:34 PM, Erick Erickson <[hidden email]>
wrote:
