Do backups of collections need to be taken on the Leader?

Koen De Groote
I'm trying to restore a couple of collections, and one keeps failing. This
happens to be the only one whose leader isn't on the host that the backup
was taken from.


The backup was done on server1, for all collections.

For this collection that is failing, the Leader was on server2. All other
collections had their leader on server1. All collections had 1 replica, on
the other server.

I would think that having the replica there would be enough to perform a
restore.

Or does the backup need to happen on the actual leader?

Kind regards,
Koen De Groote

Re: Do backups of collections need to be taken on the Leader?

Jon Kjær Amundsen
Hi Koen

A quick sanity check:
Do you use a network drive accessible from both servers to make the backup
to?
If you've backed up server2's collection to a local disk, then when you're
trying to restore it via server1 it does not know anything about the backup.

Venlig hilsen/Best regards

*Jon Kjær Amundsen*
Developer


Phone: +45 7023 9080
E-mail: [hidden email]
Web: www.udbudsvagten.dk
Parken - Tårn D - 5. Sal
Øster Allé 48 | DK - 2100 København

<http://dk.linkedin.com/in/JonKjaerAmundsen/>

Intelligent Offentlig Samhandel
*Før, under og efter udbud*

*Følg UdbudsVagten og markedet her Linkedin
<http://www.linkedin.com/groups?groupDashboard=&gid=1862353> *


Re: Do backups of collections need to be taken on the Leader?

Koen De Groote
Yes, both servers back up to a network drive.

However, that is not the point of my question.

The point of my question is: if I execute the curl command that contacts
the Collections API to perform the backup, does it matter that the leader
is on a different host from the one where the backup command was executed?
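
For reference, a minimal sketch of the kind of request being discussed. It only builds the Collections API BACKUP URL so the same call can be pointed at either node; it says nothing about how Solr handles the leader. The host names, port 8983, collection name, backup name and backup location below are assumptions for illustration, not values from this thread.

```python
from urllib.parse import urlencode


def backup_url(host, collection, backup_name, location, port=8983):
    """Build a Collections API BACKUP request URL aimed at one Solr node."""
    params = {
        "action": "BACKUP",
        "name": backup_name,       # name of the backup to create
        "collection": collection,  # collection to back up
        "location": location,      # shared path that every node can reach
    }
    return f"http://{host}:{port}/solr/admin/collections?{urlencode(params)}"


# The same request addressed to either node (hypothetical hosts and paths).
for host in ("server1", "server2"):
    print(backup_url(host, "mycollection", "mycollection_backup", "/mnt/solr_backups"))
```

Each printed URL can be passed to curl as-is; the only difference between the two is the node that receives the request.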



Re: Do backups of collections need to be taken on the Leader?

Koen De Groote
To clarify: both servers back up to the same network drive, sorry.

Re: Do backups of collections need to be taken on the Leader?

Jon Kjær Amundsen
As a restore is server agnostic (i.e. you can restore to a totally
different host than the backup was taken from), that shouldn't be the problem.
Also, a collection as such has no leader; only shards have one.

Do you have any kind of logs stating the errors encountered?

Re: Do backups of collections need to be taken on the Leader?

Koen De Groote
The error was a ZooKeeper connect timeout, which apparently is hardcoded to
180 seconds.

I've recently succeeded in the restore. It may well have been a connection
issue, since the environment is a shared VM environment. Outside pressure
is possible.

The timeout in the source code (version 7.6.0):
https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L301

And eventually it gets used here:
https://github.com/apache/lucene-solr/blob/1d85cd783863f75cea133fb9c452302214165a4d/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L359

This makes me wonder: is there a hard limit? Can the restore take only 180
seconds before it fails, or is that timeout per connection attempt to ZooKeeper?



Re: Do backups of collections need to be taken on the Leader?

Koen De Groote
Correction: that should read 'is "it" a hard limit', sorry.

Re: Do backups of collections need to be taken on the Leader?

Jon Kjær Amundsen
We have restores that take longer, and we do not have problems with timeouts.
However, we use the async parameter:
https://lucene.apache.org/solr/guide/7_2/collections-api.html#CollectionsAPI-restore

From the code you provided, it looks as though the request will time out after
180 seconds if you do not make the restore async.
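
A rough sketch of what the async approach can look like, assuming the Collections API RESTORE action with an `async` request id and a follow-up REQUESTSTATUS call. The node address, collection name, backup name, location and request id are made up for illustration, and `fetch` is a hypothetical callable that performs the HTTP GET and returns the response body, so the polling logic can be tried without a running cluster.

```python
import json
import time
from urllib.parse import urlencode

# Hypothetical node address; any node serving the Collections API would do.
BASE = "http://server1:8983/solr/admin/collections"


def restore_url(collection, backup_name, location, request_id):
    """Build a RESTORE request that returns at once and runs in the background."""
    params = {
        "action": "RESTORE",
        "name": backup_name,
        "collection": collection,
        "location": location,
        "async": request_id,  # caller-chosen id used later to look up the task
    }
    return f"{BASE}?{urlencode(params)}"


def status_url(request_id):
    """Build a REQUESTSTATUS request for a previously submitted async task."""
    params = {"action": "REQUESTSTATUS", "requestid": request_id, "wt": "json"}
    return f"{BASE}?{urlencode(params)}"


def task_state(response_body):
    """Read the task state from a REQUESTSTATUS JSON response body."""
    return json.loads(response_body)["status"]["state"]


def wait_for(request_id, fetch, poll_seconds=5.0, sleep=time.sleep):
    """Poll until the task stops running; fetch(url) must return the body text."""
    while True:
        state = task_state(fetch(status_url(request_id)))
        if state in ("completed", "failed", "notfound"):
            return state
        sleep(poll_seconds)
```

With this shape the client decides how long to keep polling, so a slow restore is not tied to how long a single HTTP request stays open.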
