SolrCloud: How best to do backups?


SolrCloud: How best to do backups?

Kelly, Frank
We have a large SolrCloud deployment on AWS (350m documents spread across 3 collections, each with 3 shards and 3 replicas).
It runs on 3 x r3.xlarge instances, with the data stored on EBS volumes with Provisioned IOPS.

It currently handles 38m requests per day.

My question is: how should we best back up the search index?
Is there some way to take a backup snapshot while Solr remains online, without horribly affecting performance?

Right now, in the event of a catastrophic failure, it would take several weeks to reindex the data using our current process (which is outdated).

-Frank

 

Frank Kelly

Principal Software Engineer

AAA Identity Profile Team (SCBE / CDA) 


HERE 

5 Wayside Rd, Burlington, MA 01803, USA

42° 29' 7" N 71° 11' 32" W

 

               

Re: SolrCloud: How best to do backups?

JohnB
Hmmm...

Can you (fairly quickly) reproduce this AWS environment (including the
indexes)?  Or does it require that several-week process to provision new
Solr boxes...?

What happens now if one of those EC2 instances gets into trouble?  Do you
have autoscaling groups set up?

On Thu, Feb 8, 2018 at 1:44 PM, Kelly, Frank <[hidden email]> wrote:

> We have a large SolrCloud deployment on AWS (350m documents spread across
> 3 collections, each with 3 shards and 3 replicas).
> It runs on 3 x r3.xlarge instances, with the data stored on EBS volumes with
> Provisioned IOPS.
>
> It currently handles 38m requests per day.
>
> My question is: how should we best back up the search index?
> Is there some way to take a backup snapshot while Solr remains online,
> without horribly affecting performance?
>
> Right now, in the event of a catastrophic failure, it would take several
> weeks to reindex the data using our current process (which is outdated).
>
> -Frank

Re: SolrCloud: How best to do backups?

JohnB
This article may be of some use...

What isn't clear is what effect either of the two strategies mentioned
would have on serving responses to queries...  It would be nice if the
backup was a "low priority thread" compared to the needs of the server in
question, but I've never had to dig that deep before...

https://n2ws.com/how-to-guides/automate-amazon-ec2-instance-backup.html

On Thu, Feb 8, 2018 at 2:00 PM, John Bickerstaff <[hidden email]>
wrote:

> Hmmm...
>
> Can you (fairly quickly) reproduce this AWS environment (including the
> indexes)?  Or does it require that several-week process to provision new
> Solr boxes...?
>
> What happens now if one of those EC2 instances gets into trouble?  Do you
> have autoscaling groups set up?

RE: SolrCloud: How best to do backups?

Davis, Daniel (NIH/NLM) [C]
I would suggest a separate EBS volume on each server to hold the backup. These EBS volumes would be mounted all the time, but only modified by a backup.

Then, you can create an AWS Lambda function that runs on a periodic trigger from CloudWatch and does the following:

- run the backup (using the SolrCloud backup API described at https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html#solrcloud-backups)
- snapshot the EBS volume

If you lose a node, you provision a new backup EBS volume from the latest snapshot. If you lose the node during a backup, the latest snapshot is the one from the previous run.
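
The two steps above could be sketched roughly as follows, assuming the Lambda function is written in Python. This is only an illustration, not a tested deployment: the function names, the Solr base URL, the collection name, the backup name, the backup location and the volume id are all hypothetical. `ec2_client` is assumed to behave like `boto3.client("ec2")`, and `http_get` is any callable that issues a GET request for the URL it is given.

```python
from urllib.parse import urlencode


def solr_backup_url(base_url, collection, backup_name, location):
    """Build a Collections API BACKUP request URL for one collection.

    base_url is the Solr root, e.g. "http://host:8983/solr" (hypothetical).
    location is the directory the backup is written to (hypothetical path).
    """
    params = {
        "action": "BACKUP",
        "name": backup_name,
        "collection": collection,
        "location": location,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def snapshot_volume(ec2_client, volume_id, description):
    """Start an EBS snapshot of the backup volume and return its snapshot id."""
    response = ec2_client.create_snapshot(
        VolumeId=volume_id,
        Description=description,
    )
    return response["SnapshotId"]


def backup_then_snapshot(http_get, ec2_client, base_url, collection,
                         backup_name, location, volume_id):
    """Run the Solr backup first, then snapshot the volume it was written to.

    Assumption: http_get returns only after Solr has finished the backup.
    For a large index the request may outlive a Lambda timeout, in which
    case an asynchronous request plus status polling is probably needed.
    """
    http_get(solr_backup_url(base_url, collection, backup_name, location))
    return snapshot_volume(ec2_client, volume_id, f"Solr backup {backup_name}")
```

In a real Lambda handler, `http_get` could be something like `urllib.request.urlopen`, and the function would be called once per collection. Ordering matters here: the snapshot is only started after the backup request has returned, so the snapshot never captures a half-written backup directory.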

If you can stop indexing to these three servers periodically, you can also just snapshot their primary EBS volumes and use those snapshots for the restore (because the index is not being updated while the snapshot is taken).
 
Does this make any sense?

-----Original Message-----
From: John Bickerstaff [mailto:[hidden email]]
Sent: Thursday, February 8, 2018 4:06 PM
To: [hidden email]
Subject: Re: SolrCloud: How best to do backups?

This article may be of some use...

What isn't clear is what effect either of the two strategies mentioned would have on serving responses to queries...  It would be nice if the backup was a "low priority thread" compared to the needs of the server in question, but I've never had to dig that deep before...

https://n2ws.com/how-to-guides/automate-amazon-ec2-instance-backup.html


Re: SolrCloud: How best to do backups?

Kelly, Frank
In reply to this post by JohnB
Sorry - just got back to this

1. We can stand up the AWS resources quickly (~ 30 mins) but the process of
repopulating the index is very slow (< 1k docs per second). We need to fix
this, but I'm hoping a backup solution would be a mitigation in the
meantime.

2. Yes, we have autoscaling (with a process to add replicas) if we lose an
instance.

3. The process you linked in the other reply that describes how to back up
SolrCloud only works if you have one EBS volume. If you have multiple
(one EBS volume for each SolrCloud VM) then you have the challenge of
making sure the EBS volumes are backed up at exactly the same time
(otherwise you risk some corruption of the indexes, as they are
captured at slightly different times).
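
For a sense of scale on point 1, here is a rough calculation using only the figures quoted in this thread (350m documents, under 1k docs per second). The three-week value is an assumed reading of "several weeks", and the function names are made up for the example.

```python
SECONDS_PER_DAY = 86_400
TOTAL_DOCS = 350_000_000  # "350m documents" from the original post


def reindex_days(total_docs, docs_per_second):
    """Days needed to reindex at a constant indexing rate."""
    return total_docs / docs_per_second / SECONDS_PER_DAY


def implied_docs_per_second(total_docs, weeks):
    """Average rate implied by a full reindex taking the given number of weeks."""
    return total_docs / (weeks * 7 * SECONDS_PER_DAY)


# Best case at the quoted ceiling of 1k docs per second: about 4.05 days.
best_case_days = reindex_days(TOTAL_DOCS, 1_000)

# Assumed "several weeks" = 3 weeks: roughly 193 docs per second on average.
implied_rate = implied_docs_per_second(TOTAL_DOCS, 3)
```

So even at the 1k docs per second ceiling a full reindex takes about four days, and a several-week reindex implies a sustained rate well below that ceiling.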

-Frank

 


