Snapshooting or replicating recently indexed data

Doss
Hi,

It seems the snapshooter takes an exact copy of the indexed data, that is, all the contents of the index directory. How can we take only the recently added ones?
...
cp -lr ${data_dir}/index ${temp}
mv ${temp} ${name} ...

Thanks,
Doss.

Re: Snapshooting or replicating recently indexed data

Yonik Seeley-2
On 4/19/07, Doss <[hidden email]> wrote:
> It seems the snapshooter  takes the exact copy of the indexed data, that is all the contents inside the index directory,  how can we take the recently added once?
> ...
> cp -lr ${data_dir}/index ${temp}
> mv ${temp} ${name} ...


I don't quite understand your question, but since hard links are used,
it's more like pointing to the index files instead of copying them.
Rsync is used as a transport to only move the files that were changed
from the master to slaves.

-Yonik
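
A minimal sketch of the hard-link point above, assuming GNU cp (for -l) and a shell whose test builtin supports -ef. The scratch directory and the file name _0.cfs are made up for illustration and are not taken from snapshooter itself.

```shell
#!/bin/sh
# Hypothetical scratch directory standing in for ${data_dir}.
work=$(mktemp -d)
mkdir "$work/index"
echo "segment data" > "$work/index/_0.cfs"   # illustrative file name

# Same idea as the quoted excerpt: link the files instead of copying them.
# -l (hard-link instead of copy) is a GNU cp option.
cp -lr "$work/index" "$work/snapshot.tmp"
mv "$work/snapshot.tmp" "$work/snapshot.1"

# -ef is true when both paths refer to the same file (same device and inode),
# so "yes" means the snapshot entry points at the index file rather than a copy.
if [ "$work/index/_0.cfs" -ef "$work/snapshot.1/_0.cfs" ]; then
    same_file=yes
else
    same_file=no
fi
echo "same underlying file: $same_file"

rm -rf "$work"
```

Because both names refer to one file on disk, taking the snapshot uses almost no extra space regardless of index size.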

Re: Snapshooting or replicating recently indexed data

Doss
Hi Yonik,

Thanks for your quick response. My question is this: can we take incremental
backups or do incremental replication in Solr?

Regards,
Doss.


M. MOHANDOSS Software Engineer Ext: 507 (A BharatMatrimony Enterprise)

Re: Snapshooting or replicating recently indexed data

Kevin Lewandowski
snapshooter does create incremental builds of the index. It doesn't
appear so if you look at the contents because the existing files are
hard links. But it is incremental.
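
A rough sketch of why successive snapshots look complete but are effectively incremental, assuming GNU cp and a test builtin with -ef. The directory layout and segment file names are hypothetical.

```shell
#!/bin/sh
work=$(mktemp -d)                      # hypothetical scratch directory
mkdir "$work/index"
echo "old segment" > "$work/index/_0.cfs"

# First snapshot: every entry is a hard link to the index file.
cp -lr "$work/index" "$work/snapshot.1"

# Later, indexing adds a new segment file; the old one is untouched.
echo "new segment" > "$work/index/_1.cfs"
cp -lr "$work/index" "$work/snapshot.2"

# List files present in the second snapshot but not in the first.
new_files=""
for f in "$work"/snapshot.2/*; do
    name=$(basename "$f")
    if [ ! -e "$work/snapshot.1/$name" ]; then
        new_files="${new_files:+$new_files }$name"
    fi
done
echo "only in snapshot.2: $new_files"

# The old segment is one file on disk, reachable from both snapshots.
if [ "$work/snapshot.1/_0.cfs" -ef "$work/snapshot.2/_0.cfs" ]; then
    shared=yes
else
    shared=no
fi
echo "old segment shared between snapshots: $shared"

rm -rf "$work"
```

Both snapshot directories list the old segment, yet only the new segment costs additional disk space.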

Re: Snapshooting or replicating recently indexed data

Bill Au
Here's the Solr Wiki on collection distribution:

http://wiki.apache.org/solr/CollectionDistribution

It describes the "incremental" nature of the distribution:

A collection is a directory of many files. Collections are distributed
to the slaves as snapshots of these files. Each snapshot is made up of
hard links to the files so copying of the actual files is not
necessary when snapshots are created. Lucene only significantly
rewrites files following an optimization command. Generally, a file
once written, will change very little if at all. This makes the
underlying transport of rsync very useful. Files that have already
been transferred and have not changed do not need to be re-transferred
with the new edition of a collection.

Bill
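
To illustrate the effect described in the wiki passage, here is a small stand-in written in plain shell. It is not rsync and does not show how the Solr distribution scripts invoke it. It copies a file only when the slave lacks it or holds different content, whereas rsync by default decides based on size and modification time. The master and slave directories and the file names are hypothetical.

```shell
#!/bin/sh
work=$(mktemp -d)                            # hypothetical scratch directory
mkdir "$work/master" "$work/slave"
echo "segment 0" > "$work/master/_0.cfs"
echo "segment 0" > "$work/slave/_0.cfs"      # already transferred earlier
echo "segment 1" > "$work/master/_1.cfs"     # new since the last transfer

transferred=""
for f in "$work"/master/*; do
    name=$(basename "$f")
    # Copy when the slave has no such file, or when the contents differ.
    # cmp -s prints nothing and reports the result through its exit status.
    if [ ! -e "$work/slave/$name" ] || ! cmp -s "$f" "$work/slave/$name"; then
        cp "$f" "$work/slave/$name"
        transferred="${transferred:+$transferred }$name"
    fi
done
echo "transferred: $transferred"

rm -rf "$work"
```

Only the segment that is new on the master is copied. The unchanged segment is skipped, which is the property that keeps each distribution cheap between optimizations.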
