Per Shard Replication Factor

Steven Bower
Is it currently possible to have a per-shard replication factor?

A bit of background on the use case...

If you are hashing content to shards by a known factor (let's say date
ranges, 12 shards, 1 per month), it might be the case that most of your
search traffic would be directed to one particular shard (e.g. the current
month's shard), and having increased query capacity in that shard would be
useful. This could be extended to many use cases, such as data hashed by
organization, type, etc.

Thanks,

steve

Re: Per Shard Replication Factor

Otis Gospodnetić
Could these just be different collections? Then sharding and replication are
independent, and you can reduce the replication factor as the index ages.

Otis
Solr & ElasticSearch Support
http://sematext.com/
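
A minimal sketch of the separate-collections suggestion above, assuming one single-shard collection per month and a replication factor that shrinks with age. The base URL, config name, collection naming scheme, and replica counts are all hypothetical; only the `action`, `name`, `numShards`, `replicationFactor`, and `collection.configName` parameters come from Solr's Collections API CREATE action, and they should be checked against the documentation for the Solr version in use. The code only builds request URLs and does not contact a server.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration; none of these come from the thread.
SOLR_BASE = "http://localhost:8983/solr"
CONFIG_NAME = "monthly_conf"


def replication_factor_for(age_in_months):
    """Assumed policy: most replicas for the current month, fewer as data ages."""
    if age_in_months == 0:
        return 4
    if age_in_months <= 2:
        return 2
    return 1


def collection_name(year, month):
    """Hypothetical naming scheme: one collection per calendar month."""
    return f"docs_{year:04d}_{month:02d}"


def create_collection_url(year, month, age_in_months):
    """Build a Collections API CREATE request URL for one monthly collection."""
    params = {
        "action": "CREATE",
        "name": collection_name(year, month),
        "numShards": 1,
        "replicationFactor": replication_factor_for(age_in_months),
        "collection.configName": CONFIG_NAME,
    }
    return f"{SOLR_BASE}/admin/collections?{urlencode(params)}"


def monthly_plan(current_year, current_month, months=12):
    """Return (collection name, replication factor) pairs, newest month first."""
    plan = []
    for age in range(months):
        # Count months from year 0 so that stepping back crosses year boundaries.
        total = current_year * 12 + (current_month - 1) - age
        year, month_index = divmod(total, 12)
        plan.append((collection_name(year, month_index + 1),
                     replication_factor_for(age)))
    return plan


if __name__ == "__main__":
    for name, factor in monthly_plan(2013, 5):
        print(name, factor)
    print(create_collection_url(2013, 5, 0))
```

Under this scheme, lowering the replication factor of an aging month means removing replicas from that month's collection only, without touching the others.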
On May 9, 2013 1:43 AM, "Steven Bower" <[hidden email]> wrote:

> Is it currently possible to have per-shard replication factor?

Re: Per Shard Replication Factor

Steven Bower-2
This approach would work to satisfy the requirement, but I think it would
generally be nice to have the ability to control this within a single
collection (so you don't give up any functionality when querying across
the collections, and to make management of the system easier).

Anyway, I'll create a ticket and take a look at how this might work.

steve


On Thu, May 9, 2013 at 8:23 PM, Otis Gospodnetic <[hidden email]> wrote:

> Could these just be different collections? Then sharding and replication is
> independent.  And you can reduce replication factor as the index ages.

Re: Per Shard Replication Factor

Joel Bernstein
I agree this would be a nice feature. Steven, can you update this thread
with the ticket? Thanks, Joel


On Fri, May 10, 2013 at 9:58 AM, Steven Bower <[hidden email]> wrote:

> This approach would work to satisfy the requirement but I think would
> generally be nice to have the ability to control this within a single
> collection (so you don't give up any functionality when querying between
> the collections and to make the management of the system easier).
>
> Anyway I'll create a ticket and take a look at how this might work..

--
Joel Bernstein
Professional Services LucidWorks

Re: Per Shard Replication Factor

Shalin Shekhar Mangar
There's an issue already:
https://issues.apache.org/jira/browse/SOLR-4808


On Fri, May 10, 2013 at 11:50 AM, Joel Bernstein <[hidden email]> wrote:

> I agree this would be a nice feature. Steven can update this thread with
> ticket? Thanks Joel

--
Regards,
Shalin Shekhar Mangar.