Zookeeper setup for solr cloud

varun srivastava
Hi,
 I would like a recommendation on ZooKeeper ensemble architecture. I am
considering the following options; please let me know whether my pros and
cons for each are correct, and feel free to add any differentiating points
I am missing.

1) Run the ZooKeeper ensemble on separate boxes, with all the SolrCloud
instances accessing it at runtime.
  Pros: A small set of ZooKeeper instances to maintain. Syncing between
the ZooKeeper boxes may be faster and more reliable.

2) Run a ZooKeeper instance on each Solr box as well, with each Solr
instance accessing the ZooKeeper on localhost.
   Pros: Solr does not incur an over-the-wire cost at runtime, so it should
be fast. More fault tolerant, since Solr does not go over the wire to reach
ZooKeeper.
   Con: Many ZooKeeper instances, so updates may be slow.


Thanks
Varun

Re: Zookeeper setup for solr cloud

Lance Norskog-2
You can find Solr information with this:
http://find.searchhub.org/?q=zookeeper+cluster

http://find.searchhub.org/link?url=http://wiki.apache.org/solr/SolrCloud


Re: Zookeeper setup for solr cloud

varun srivastava
Hi,
  Rephrasing my question: please let me know if anyone sees a problem with
the following SolrCloud deployment.

1) Have 200 SolrCloud nodes (serv1, serv2, .. serv200), with each machine
running both ZooKeeper and Solr.
2) The ZooKeeper config contains the list of all servers:

server.1=serv1:2888:3888
server.2=serv2:2888:3888

...
server.200=serv200:2888:3888


3) Each Solr instance talks only to the ZooKeeper on localhost:

 -DzkHost=localhost:9983


Thanks
Varun



RE: Zookeeper setup for solr cloud

Markus Jelsma-2
Hi Varun,

Running many ZooKeeper instances improves read performance but has a negative impact on writing state to ZooKeeper. Having a node talk only to its local ZooKeeper instance limits availability: your ZooKeeper daemon will die at some point, and that will cut your Solr node off from the entire cluster. Running so many ZooKeeper daemons is also a waste of resources, mostly RAM, which you should leave for Solr's mmapped files.
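
As an illustration of the availability point, Solr's zkHost property accepts a comma-separated list of host:port pairs, so a node can be pointed at every member of a small ensemble instead of only localhost. The following is a minimal sketch of building such a string; the hostnames zk1 to zk5 and the client port 2181 are assumptions for the example and do not come from this thread.

```python
def zk_host_string(hosts, client_port=2181):
    """Join ensemble members into a comma-separated host:port list."""
    return ",".join(f"{host}:{client_port}" for host in hosts)


# Hypothetical five-node ensemble; replace with your own hostnames.
ensemble = [f"zk{i}" for i in range(1, 6)]
zk_host = zk_host_string(ensemble)

# The resulting value would be passed to each Solr node at startup.
print(f"-DzkHost={zk_host}")
```

With every ensemble member listed, the ZooKeeper client in a Solr node can reconnect to another server if the one it is using goes down.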

As a minimum you must run three ZooKeeper daemons in the network, and never an even number, because the extra daemon has no positive effect on the quorum that ZooKeeper needs. In your case I would start with five or seven daemons spread across the network, not sharing virtual machines and, if possible, not sharing switches.
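
The odd-number advice follows from majority-quorum arithmetic, which can be sketched as below. This is only an illustration of the arithmetic, assuming a plain majority quorum over all servers (no observers or weighted groups); the function names are made up for the example.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5, 6, 7, 200):
    print(f"servers={n:3d}  quorum={quorum_size(n):3d}  "
          f"tolerated failures={tolerated_failures(n):3d}")
```

Four servers tolerate one failure, the same as three, and six tolerate two, the same as five, so the even-numbered member adds cost without adding fault tolerance. Under the same assumption, a 200-server ensemble would need 101 servers to agree on every write.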

Cheers,
Markus
 