AddReplica to shard with lowest node count


Duncan, Adam
Hi all,

Our team uses SolrCloud on Solr 5.1 and is investigating an upgrade to 7.3.
Currently we have a working scale-up approach for adding a new server to the cluster beyond the initial collection creation.
We’ve automated the install of Solr on new servers and, following that, we register the new instance with ZooKeeper so that the server will be included in the list of live nodes.
Finally we use the CoreAdmin API ‘Create’ command to associate the new node with our collection. Solr 5.1's CoreAdmin Create command would conveniently auto-assign the new node to the shard with the fewest nodes.

In Solr 7.3, the CoreAdmin API documentation warns us not to use the Create command with SolrCloud.
We tried 7.3’s CoreAdmin API Create command regardless and, unsurprisingly, it did not work.
The 7.3 documentation suggests we use the Collections API AddReplica command. The problem with AddReplica is that it expects us to specify the shard name.
This is unfortunate as it makes it hard for us to keep shards balanced. It puts the onus on us to work out the least populated shard via a call to the cluster status endpoint.
With that, we now face the problem of managing this correctly when scaling up multiple servers at once.
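
For illustration, the manual approach described above could look roughly like the Python sketch below. It assumes the cluster status JSON has already been fetched from the Collections API (action=CLUSTERSTATUS) and that each shard lists its replicas under a "replicas" key. The helper names, the collection name "mycoll", the host names and the sample data are hypothetical, not taken from our cluster.

```python
from urllib.parse import urlencode


def least_populated_shard(cluster_status: dict, collection: str) -> str:
    """Return the name of the shard that currently has the fewest replicas."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    # Compare by (replica count, shard name) so that ties resolve deterministically.
    return min(shards, key=lambda name: (len(shards[name]["replicas"]), name))


def add_replica_url(base_url: str, collection: str, shard: str, node: str) -> str:
    """Build a Collections API ADDREPLICA request URL for the chosen shard and node."""
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": node,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hand-written sample shaped like a CLUSTERSTATUS response (hypothetical data).
sample_status = {
    "cluster": {
        "collections": {
            "mycoll": {
                "shards": {
                    "shard1": {"replicas": {"core_node1": {}, "core_node2": {}}},
                    "shard2": {"replicas": {"core_node3": {}}},
                }
            }
        }
    }
}

target_shard = least_populated_shard(sample_status, "mycoll")
url = add_replica_url(
    "http://host1:8983/solr", "mycoll", target_shard, "host3:8983_solr"
)
```

This does nothing about the concurrency concern: if several new servers run it at the same moment, each would read the same status and could pick the same shard, so the read-then-add step would have to be serialized somehow.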

Are we missing something here? Is there really no way for a node to be auto-assigned to a shard in 7.3?
And if so, are there any recommendations for an approach to reliably doing this ourselves?

Thanks!
Adam

Re: AddReplica to shard with lowest node count

Gus Heck
Perhaps the rule based replica placement stuff would do the trick?

https://lucene.apache.org/solr/guide/7_3/rule-based-replica-placement.html

I haven't used it myself but I've seen lots of work going into it lately...

On Wed, Jul 4, 2018 at 12:35 PM, Duncan, Adam <[hidden email]>
wrote:




--
http://www.the111shift.com

Re: AddReplica to shard with lowest node count

Shalin Shekhar Mangar
The rule based replica placement was deprecated. The autoscaling APIs are
the way to go. Please see
http://lucene.apache.org/solr/guide/7_3/solrcloud-autoscaling.html

Your use-case is interesting. By default, the trigger for nodeAdded event
will move replicas from the most loaded nodes to the new node. That does
not take care of your use-case. Can you please open a Jira to add this
feature?

On Thu, Jul 5, 2018 at 6:45 AM Gus Heck <[hidden email]> wrote:



--
Regards,
Shalin Shekhar Mangar.

Re: AddReplica to shard with lowest node count

Gus Heck
Ah hmm I guess I didn't realize the autoscaling didn't use the rule based
stuff (haven't had opportunity to work with either). If it's deprecated,
maybe that suggests we need a highly visible warning box on the ref guide
page?

On Thu, Jul 5, 2018 at 12:18 AM, Shalin Shekhar Mangar <
[hidden email]> wrote:




--
http://www.the111shift.com

Re: AddReplica to shard with lowest node count

Duncan, Adam
In reply to this post by Shalin Shekhar Mangar
Thanks for your responses.

I’ve tried to get more familiar with the Autoscaling API. I’ve applied a nodeAdded trigger, but I’m stuck trying to think of a cluster policy that would suit my scenario; something like “All new nodes must have one replica from each available collection”.
Is this possible? Or is that the point you were getting at by saying my use-case isn’t supported, Shalin?
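
For reference, a minimal sketch of the kind of trigger registration meant here, written as a Python snippet that builds the JSON body for a set-trigger command. The field names (name, event, waitFor, enabled) follow my reading of the 7.x autoscaling docs and should be checked against the guide for your exact version; the trigger name and wait time are arbitrary, and the helper function is hypothetical. This only registers the trigger; it does not express the per-collection placement policy asked about above.

```python
import json


def node_added_trigger(name: str = "node_added_trigger", wait_for: str = "5s") -> dict:
    """Build a set-trigger command body for a nodeAdded event (assumed field names)."""
    return {
        "set-trigger": {
            "name": name,           # arbitrary identifier chosen for this trigger
            "event": "nodeAdded",   # fire when a new node joins the cluster
            "waitFor": wait_for,    # delay before acting on the event
            "enabled": True,
        }
    }


# Serialized body that would be POSTed to the autoscaling endpoint.
payload = json.dumps(node_added_trigger(), indent=2)
```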

Regards,
Adam
