Collection API Snapshots and Restore


Collection API Snapshots and Restore

Nicolas Bélisle
Is it possible to restore a snapshot without exporting it, using the
Collection API?

We have a very large index and want to provide a quick restore option
(instead of exporting + restoring + fixing the topology).

I am thinking about developing a script that would read the snapshot's XML
description and manually delete, on each core, the files that are not part
of that snapshot.

Is there another solution?

Nicolas

Re: Collection API Snapshots and Restore

Erick Erickson
I really have no idea how that would work. If you don't copy the index
somewhere, you simply can't restore it. If you're thinking about
selectively copying index files out then back in, you'll have to
somehow lock it at point X until your delta-copy is done.

"read the snapshot's XML description and manually delete files not
part of that snapshot on each core."

How would that deal with segment files that have been added?

Best,
Erick

On Fri, Feb 9, 2018 at 9:56 PM, Nicolas Bélisle
<[hidden email]> wrote:

> Is it possible to restore a snapshot without exporting it, using the
> Collection API?
>
> We have a very large index and want to provide a quick restore option
> (instead of exporting + restoring + fixing the topology).
>
> I am thinking about developing a script that would read the snapshot's XML
> description and manually delete, on each core, the files that are not part
> of that snapshot.
>
> Is there another solution?
>
> Nicolas

Re: Collection API Snapshots and Restore

Nicolas Bélisle
Thanks for the quick response.

I tried the following and it works.

1. Take a snapshot of your index:
http://localhost:8983/solr/admin/collections?action=CREATESNAPSHOT&collection=<collectionName>&commitName=backup1
2. Add a few documents.
3. List the files for the snapshot:
http://localhost:8983/solr/admin/collections?action=LISTSNAPSHOTS&collection=<collectionName>

Example:
<lst name="records_shard2_replica2">
  <str name="core">records_shard2_replica2</str>
  <str name="indexDirPath">C:\apps\solrCloud\solr\server-3\server\solr\records_shard2_replica2\data\index/</str>
  <long name="generation">112</long>
  <str name="shard_id">shard2</str>
  <bool name="leader">false</bool>
  <arr name="files">
    <str>_153_Lucene50_0.pos</str>
    <str>segments_34</str>
    <str>_153.fnm</str>
    <str>_153_Lucene54_0.dvm</str>
    <str>_153_Lucene50_0.doc</str>
    <str>_153_Lucene50_0.tim</str>
    <str>_153.si</str>
    <str>_153_Lucene54_0.dvd</str>
    <str>_153_Lucene50_0.tip</str>
    <str>_153.fdx</str>
    <str>_153.fdt</str>
  </arr>
</lst>
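
As an illustration of step 3, here is a minimal Python sketch that pulls the index directory and file list for each core out of a LISTSNAPSHOTS response. It assumes only the per-core `<lst>` shape shown in the example above (a `core` entry, an `indexDirPath` entry and a `files` array); the rest of the response layout is not shown in this thread, so the code simply scans every `<lst>` element. The function name `snapshot_files_by_core` is made up for this sketch.

```python
import xml.etree.ElementTree as ET


def snapshot_files_by_core(xml_text):
    """Map each core name to (indexDirPath, set of snapshot file names).

    Assumption: every per-core <lst> carries a <str name="indexDirPath">
    and an <arr name="files">, as in the example in this thread.
    """
    root = ET.fromstring(xml_text)
    result = {}
    # iter() also visits the root element, so a bare per-core <lst> works too.
    for lst in root.iter("lst"):
        path = lst.find("str[@name='indexDirPath']")
        files = lst.find("arr[@name='files']")
        if path is None or files is None:
            continue  # not a per-core entry
        # Prefer the explicit core name, fall back to the <lst> name attribute.
        core = lst.findtext("str[@name='core']") or lst.get("name")
        index_dir = (path.text or "").strip()
        names = {(s.text or "").strip() for s in files.findall("str")}
        result[core] = (index_dir, names)
    return result
```

The returned file set per core is the "keep list" used in step 5.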

4. Stop Solr

5. For each core in your collection, in the folder "indexDirPath" delete
all the files except the ones listed in the snapshot.

6. Restart Solr

7. The index is back to the state it was in at step 1.
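
Step 5 could be scripted roughly as follows. This is only a sketch under the same assumptions as the manual procedure: Solr is stopped, and `keep_files` is the file list reported for that core by LISTSNAPSHOTS. The function name `prune_index_dir`, the dry-run default and the "refuse if a snapshot file is missing" check are additions made for this sketch, not part of Solr.

```python
from pathlib import Path


def prune_index_dir(index_dir, keep_files, dry_run=True):
    """Delete every regular file in index_dir that is not in keep_files.

    Returns the sorted names of the files that were (or, with dry_run,
    would be) deleted. Must only be run while Solr is stopped.
    """
    index_dir = Path(index_dir)
    keep = set(keep_files)
    present = {p.name for p in index_dir.iterdir() if p.is_file()}

    # Safety check (an assumption of this sketch): if any snapshot file is
    # no longer on disk, the snapshot cannot be restored by pruning alone.
    missing = keep - present
    if missing:
        raise FileNotFoundError(f"snapshot files missing: {sorted(missing)}")

    doomed = sorted(present - keep)
    if not dry_run:
        for name in doomed:
            (index_dir / name).unlink()
    return doomed
```

Running it first with `dry_run=True` on each core's `indexDirPath` shows which segment files were added after the snapshot before anything is removed.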

Why it's interesting:
- It's a lot faster (a few seconds) than exporting and re-importing the
snapshot, especially for large indexes.
- It does not change the topology.

However, Solr needs to be shut down.

My questions:
- Do you see flaws / risks in this strategy?
- Could it be done without shutting down Solr? If so, we would like to
contribute this feature.

Regards,

Nicolas



On Sat, Feb 10, 2018 at 1:33 PM, Erick Erickson <[hidden email]>
wrote:

> I really have no idea how that would work. If you don't copy the index
> somewhere, you simply can't restore it. If you're thinking about
> selectively copying index files out then back in, you'll have to
> somehow lock it at point X until your delta-copy is done.
>
> "read the snapshot's XML description and manually delete files not
> part of that snapshot on each core."
>
> How would that deal with segment files that have been added?
>
> Best,
> Erick