Backup and distributed index/backup management

al patel
Hi:

I am a novice with Solr when it comes to backup and operations.

We have a single master Solr instance working well. I tried the backup
scripts and got them working fine.

My question is: even with backups, Solr will still have a single index,
right? We will have a huge amount of data in the index, and it is ever
increasing.

I want to archive older data, say every 2 weeks, and start a new index, but
I want the older indices to remain searchable.

I could take a snapshot on the master at 2-week intervals, back it up, and
restart the master with a fresh index.

On the slaves, where the actual searches happen, how do I deal with this?
Won't there be multiple indices there then?

Does Solr handle this, and if so, how? If not, how do I solve this problem?
I am open to other suggestions too.

Best Regards
-al

Re: Backup and distributed index/backup management

al patel
Reposting :)


Re: Backup and distributed index/backup management

Chris Hostetter-3

: My question is, even with backup, solr will still have a single index,
: right? We will have huge amount of data in index - it is ever increasing.

if you have older docs you want to retire out of your index, you'll need
to do that manually (delete by query can come in handy)
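
As a rough illustration of the delete-by-query idea, here is a minimal
sketch that builds the XML message Solr's update handler accepts for
deleting by query. It assumes the documents carry a date field, here
hypothetically named "timestamp", and that the helper name and the 14-day
window are illustrative only. The message would then be POSTed to the
update handler and followed by a commit.

```python
from datetime import datetime, timedelta, timezone
from xml.sax.saxutils import escape


def build_delete_by_query(field, now, max_age_days):
    """Return a Solr XML <delete> message for docs older than max_age_days.

    `field` is a hypothetical date field name; adjust to the real schema.
    """
    cutoff = now - timedelta(days=max_age_days)
    # Solr date fields use ISO 8601 in UTC with a trailing 'Z'.
    cutoff_str = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Open-ended range query: everything up to and including the cutoff.
    query = f"{field}:[* TO {cutoff_str}]"
    return f"<delete><query>{escape(query)}</query></delete>"


if __name__ == "__main__":
    now = datetime(2007, 3, 24, tzinfo=timezone.utc)
    print(build_delete_by_query("timestamp", now, 14))
```

Computing the cutoff on the client keeps the query explicit, so it is easy
to log exactly which documents a given run was meant to remove.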

: I want to archive older data - say every 2 weeks and start a new index - but
: want the older indices to be searchable.
:
: I can potentially take a snapshot at master at 2 week interval, backup and
: restart master with fresh index.

you don't really need to restart the master ... you could pull snapshots
from your master to a slave, and then when you decide that slave is "full"
of old docs you stop pulling snapshots, and delete the old docs from your
master and start replicating to a new slave.
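
The rotation described above can be sketched as a toy in-memory model. This
is not a Solr API: the class and method names are hypothetical, each index
is just a dict, and "pulling a snapshot" is modelled as copying the master.
It only shows the order of the steps and which index ends up holding which
documents.

```python
class RollingArchive:
    """Toy model of rotating slaves; every name here is illustrative."""

    def __init__(self):
        self.master = {}         # doc id -> day the doc was indexed
        self.active_slave = {}   # slave currently pulling snapshots
        self.frozen_slaves = []  # slaves that stopped pulling (archives)

    def index(self, doc_id, day):
        self.master[doc_id] = day

    def pull_snapshot(self):
        # Replication copies the master's current index to the active slave.
        self.active_slave = dict(self.master)

    def rotate(self, cutoff_day):
        # One last pull so the archive holds everything indexed so far.
        self.pull_snapshot()
        # Freeze the slave: it simply stops pulling snapshots.
        self.frozen_slaves.append(self.active_slave)
        # Delete the old docs from the master.
        self.master = {
            doc_id: day
            for doc_id, day in self.master.items()
            if day >= cutoff_day
        }
        # A new slave starts replicating from the trimmed master.
        self.active_slave = {}
        self.pull_snapshot()

    def search(self, doc_id):
        # Every index is queried separately; the positions of the indices
        # that hold the doc are returned (archives first, active slave last).
        indices = self.frozen_slaves + [self.active_slave]
        return [pos for pos, idx in enumerate(indices) if doc_id in idx]


if __name__ == "__main__":
    archive = RollingArchive()
    archive.index("old-doc", day=1)
    archive.rotate(cutoff_day=14)
    archive.index("new-doc", day=15)
    archive.pull_snapshot()
    print(archive.search("old-doc"), archive.search("new-doc"))
```

Note that in this model a search has to visit each frozen slave as well as
the active one, so the client is assumed to merge the results itself.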

: Does solr handle this - how? Or how do I solve this problem? Open to other
: suggestions too.

what you're describing is fairly outside of what i would consider "normal"
Solr usage .. it seems very special purpose.



-Hoss


Re: Backup and distributed index/backup management

al patel
Thanks Chris.

So it looks like one has to delete entries to keep the index manageable.

In my case, we need to preserve entries, so I wanted to "archive"
snapshots but still keep them searchable (thaw certain indices, if you will).

So, is anyone out there dealing with ever-increasing index sizes while
having to maintain older data?

Rgds
-a

On 3/24/07, Chris Hostetter <[hidden email]> wrote:

> : My question is, even with backup, solr will still have a single index,
> : right? We will have huge amount of data in index - it is ever increasing.
>
> if you have older docs you want to retire out of your index, you'll need
> to do that manually (delete by query can come in handy)
>
> : I want to archive older data - say every 2 weeks and start a new index - but
> : want the older indices to be searchable.
> :
> : I can potentially take a snapshot at master at 2 week interval, backup and
> : restart master with fresh index.
>
> you don't really need to restart the master ... you could pull snapshots
> from your master to a slave, and then when you decide that slave is "full"
> of old docs you stop pulling snapshots, and delete the old docs from your
> master and start replicating to a new slave.
>
> : Does solr handle this - how? Or how do I solve this problem? Open to other
> : suggestions too.
>
> what you're describing is fairly outside of what i would consider "normal"
> Solr usage .. it seems very special purpose.
>
> -Hoss