Zookeeper and Solr Clients

Malcolm Upayavira Holmes
I've recently had a patch merged into Pysolr that adds ZK awareness
(compatible with clusterstate.json). Now I need to update it to be
compatible with the newer state.json, and I just wanted to confirm my
understanding....

If we create a Python 'client' that is tied to a specific collection,
then all I need to do is set up a watch on
/collections/${collection}/state.json, and update the list of nodes
accordingly (as I would have on a watch on clusterstate.json) when
state.json changes.
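
For illustration only, a minimal Python sketch of the "update the list of nodes" step. It assumes state.json holds a JSON object keyed by collection name, with shards, replicas and a per-replica base_url; that layout should be checked against the Solr version in use. The function name node_urls and the sample data are hypothetical and are not part of Pysolr.

```python
import json

def node_urls(state_json_bytes, collection):
    """Return the sorted, de-duplicated base URLs of a collection's replicas."""
    state = json.loads(state_json_bytes.decode("utf-8"))
    urls = set()
    # Assumed layout: {collection: {"shards": {name: {"replicas": {...}}}}}
    for shard in state[collection]["shards"].values():
        for replica in shard["replicas"].values():
            urls.add(replica["base_url"])
    return sorted(urls)

# Hypothetical payload, shaped like the bytes a watch callback would receive.
sample = json.dumps({
    "films": {"shards": {
        "shard1": {"replicas": {
            "core_node1": {"base_url": "http://host1:8983/solr",
                           "node_name": "host1:8983_solr", "state": "active"},
            "core_node2": {"base_url": "http://host2:8983/solr",
                           "node_name": "host2:8983_solr", "state": "active"},
        }},
    }}
}).encode("utf-8")

print(node_urls(sample, "films"))
```

With a ZooKeeper client library such as kazoo, a function like this could be called from a data watch on /collections/${collection}/state.json, replacing the cached list each time the watch fires.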

There's a lot more that *could* be done, but for the basics, it seems
that's enough.

Is it really this simple?

Upayavira

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Zookeeper and Solr Clients

Scott Blum
You probably also want a child watch on live_nodes to monitor connected nodes.

On Thu, Feb 25, 2016 at 11:12 AM, Upayavira <[hidden email]> wrote:
> I've recently had a patch merged into Pysolr that adds ZK awareness
> (compatible with clusterstate.json). Now I need to update it to be
> compatible with the newer state.json, and I just wanted to confirm my
> understanding....
>
> If we create a Python 'client' that is tied to a specific collection,
> then all I need to do is set up a watch on
> /collections/${collection}/state.json, and update the list of nodes
> accordingly (as I would have on a watch on clusterstate.json) when
> state.json changes.
>
> There's a lot more that *could* be done, but for the basics, it seems
> that's enough.
>
> Is it really this simple?
>
> Upayavira


Re: Zookeeper and Solr Clients

Malcolm Upayavira Holmes
How does that help me? The live_nodes watch tells me when nodes go up and down, but surely I should be waiting for the overseer to do the same and update state.json. I'd just want live_nodes watched for situations where someone wants to do, say, collections API calls that aren't specific to a collection, I presume.
 
Upayavira
 
 
On Thu, Feb 25, 2016, at 04:45 PM, Scott Blum wrote:
> You probably also want a child watch on live_nodes to monitor connected nodes.
 

Re: Zookeeper and Solr Clients

Noble Paul നോബിള്‍  नोब्ळ्
In reply to this post by Malcolm Upayavira Holmes
Why do you need to watch anything? You can get the whole clusterstate
using the API. ZK access is not required.

--
-----------------------------------------------------
Noble Paul


Re: Zookeeper and Solr Clients

Malcolm Upayavira Holmes
This is for making a ZK aware Pysolr client (i.e. Python equiv of SolrJ
CloudSolrClient). It clearly needs to watch ZK to be able to update the
list of hosts that make up a collection. We can't use the API, because
we don't yet know where the Solr nodes are!

Upayavira

On Fri, Feb 26, 2016, at 09:09 AM, Noble Paul wrote:

> Why do you need to watch anything? You can get the whole clusterstate
> using the API. ZK access is not required.

Re: Zookeeper and Solr Clients

Scott Blum
Published cluster state always lags. And if a Solr node crashes, the status on affected replicas won't actually change until the owning instance tries to come back up. If you're working on a generally reusable library, you'd want to also watch live_nodes.

On Fri, Feb 26, 2016 at 5:23 AM, Upayavira <[hidden email]> wrote:
> This is for making a ZK aware Pysolr client (i.e. Python equiv of SolrJ
> CloudSolrClient). It clearly needs to watch ZK to be able to update the
> list of hosts that make up a collection. We can't use the API, because
> we don't yet know where the Solr nodes are!
>
> Upayavira


Re: Zookeeper and Solr Clients

Mark Miller-3
Right, clusterstate.json is never used by itself to determine a replica's state. It's always: is it live? Then find its state in clusterstate.json. If it's not live, the state can be anything in clusterstate.json and should be ignored.

We make some best efforts to keep it up to date, but it should not be counted on (and can't always be counted on), and the above logic is how all Solr code reads state.
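
The rule above (check liveness first, then consult the recorded state) might look like the following in a Python client. This is a sketch under assumptions: the collection state is already parsed into a dict with shards, replicas, node_name, state and base_url fields, and live_nodes is the list of child names under the live_nodes znode. The function name usable_urls and the sample data are made up for the example.

```python
def usable_urls(collection_state, live_nodes):
    """Return base URLs of replicas that are live and recorded as active."""
    live = set(live_nodes)
    urls = []
    for shard in collection_state["shards"].values():
        for replica in shard["replicas"].values():
            # Step 1: is it live? If not, ignore whatever state is recorded.
            if replica["node_name"] not in live:
                continue
            # Step 2: only for a live node is the recorded state meaningful.
            if replica["state"] == "active":
                urls.append(replica["base_url"])
    return urls

# Hypothetical data: host2 has crashed, so its "active" entry is stale.
films = {"shards": {"shard1": {"replicas": {
    "core_node1": {"node_name": "host1:8983_solr", "state": "active",
                   "base_url": "http://host1:8983/solr"},
    "core_node2": {"node_name": "host2:8983_solr", "state": "active",
                   "base_url": "http://host2:8983/solr"},
    "core_node3": {"node_name": "host3:8983_solr", "state": "recovering",
                   "base_url": "http://host3:8983/solr"},
}}}}
live_nodes = ["host1:8983_solr", "host3:8983_solr"]

print(usable_urls(films, live_nodes))
```

Here host2 is dropped even though its recorded state is "active", because it is missing from live_nodes; host3 is live but not active, so it is skipped as well.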

- Mark

On Fri, Feb 26, 2016 at 2:13 PM Scott Blum <[hidden email]> wrote:
> Published cluster state always lags. And if a Solr node crashes, the status on affected replicas won't actually change until the owning instance tries to come back up. If you're working on a generally reusable library, you'd want to also watch live_nodes.


Re: Zookeeper and Solr Clients

Malcolm Upayavira Holmes
Perfect. So, when I want to find a node to talk to, I do:
 * locate state.json or clusterstate.json
 * identify a suitable node
 * confirm the node is live, and if not repeat from the previous step
 
Then I should be good.
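
Those three steps could be sketched in Python as below. It is only an outline under assumptions: get_live_nodes is a hypothetical callable returning the current children of live_nodes (a real client would read them from its ZooKeeper watch cache), the state dict follows the shards/replicas layout of state.json, and "suitable" is taken to mean a replica recorded as active.

```python
import random

def pick_node(collection_state, get_live_nodes, rng=random):
    """Return the base URL of a live, active replica, or None if there is none."""
    # Steps 1 and 2: read the state and list candidate replicas.
    candidates = [
        replica
        for shard in collection_state["shards"].values()
        for replica in shard["replicas"].values()
    ]
    rng.shuffle(candidates)  # spread requests across replicas
    for replica in candidates:
        # Step 3: confirm the node is live, otherwise try the next candidate.
        if replica["node_name"] not in set(get_live_nodes()):
            continue
        if replica.get("state") == "active":
            return replica["base_url"]
    return None

# Hypothetical state with two replicas; only host2 is currently live.
state = {"shards": {"shard1": {"replicas": {
    "core_node1": {"node_name": "host1:8983_solr", "state": "active",
                   "base_url": "http://host1:8983/solr"},
    "core_node2": {"node_name": "host2:8983_solr", "state": "active",
                   "base_url": "http://host2:8983/solr"},
}}}}

print(pick_node(state, lambda: ["host2:8983_solr"]))
```

Passing liveness as a callable means it is re-read for every candidate, so a node that disappears between attempts is not picked from a stale list.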
 
Upayavira
 
 
On Fri, Feb 26, 2016, at 08:32 PM, Mark Miller wrote:
> Right, clusterstate.json is never used by itself to determine a replica's state. It's always: is it live? Then find its state in clusterstate.json. If it's not live, the state can be anything in clusterstate.json and should be ignored.
>
> We make some best efforts to keep it up to date, but it should not be counted on (and can't always be counted on), and the above logic is how all Solr code reads state.
>
> - Mark
 