how to access solr in solrcloud

how to access solr in solrcloud

Gu, Steve (CDC/CDC OD/OADS) (CTR)
Hi, all

I am upgrading our Solr to 7.4 and would like to set up SolrCloud for failover and load balancing.   There are three ZooKeeper servers (zk1:2181, zk1:2182) and two Solr instances, solr1:8983 and solr2:8983.  Which Solr URL should the client use for access?  Will it be solr1:8983, the leader?

If we use solr1:8983 to access Solr, what happens if solr1:8983 is down?  Will the request be routed to solr2:8983 via ZooKeeper?  I understand that ZooKeeper does all the coordination work, but I would like to understand how this works.

Any insight would be greatly appreciated.
Steve

RE: how to access solr in solrcloud

Вадим Иванов
Hi,  Steve
If you use solr1:8983 to access Solr and solr1 is down, IMHO nothing
will help you reach a dead IP.
You should switch to any other live node in the cluster, or I'd suggest
putting nginx in front of SolrCloud as a frontend.
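
For a client that is not ZK-aware, "switch to any other live node" can be done on the client side. Below is a minimal sketch in Python (standard library only), assuming the two nodes from the question; the node list, the `query_with_failover` helper, and the injectable `fetch` parameter are illustrative names, not part of Solr or any Solr client.

```python
import urllib.request

# Hypothetical node list taken from the question; adjust to your cluster.
SOLR_NODES = ["http://solr1:8983/solr", "http://solr2:8983/solr"]


def http_get(url, timeout=5):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def query_with_failover(path, nodes=SOLR_NODES, fetch=http_get):
    """Try each node in order and return the first successful response.

    `path` is the part after the base URL, e.g. "/mycollection/select?q=*:*".
    `fetch` is injectable so the logic can be exercised without a cluster.
    """
    last_error = None
    for base in nodes:
        try:
            return fetch(base + path)
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            last_error = exc
    raise ConnectionError(f"no live Solr node among {nodes}") from last_error
```

This only retries in a fixed order and does no health checking or load spreading, which is what a frontend such as nginx would add.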

--
BR, Vadim



Re: how to access solr in solrcloud

David Santamauro
... or haproxy.

Re: how to access solr in solrcloud

Walter Underwood
In reply to this post by Gu, Steve (CDC/CDC OD/OADS) (CTR)
Use a load balancer. It doesn’t have to be fancy; we use the Amazon ALB because our clusters are in AWS.

Zookeeper never handles queries. It coordinates cluster changes with the Solr instances.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

RE: how to access solr in solrcloud

Gu, Steve (CDC/CDC OD/OADS) (CTR)
In reply to this post by David Santamauro
Thanks, David

RE: how to access solr in solrcloud

Gu, Steve (CDC/CDC OD/OADS) (CTR)
In reply to this post by Walter Underwood
Thanks, Walter

RE: how to access solr in solrcloud

Gu, Steve (CDC/CDC OD/OADS) (CTR)
In reply to this post by Вадим Иванов
Vadim,

That makes perfect sense.

Thanks
Steve

Re: how to access solr in solrcloud

Shawn Heisey-2
In reply to this post by Gu, Steve (CDC/CDC OD/OADS) (CTR)
Zookeeper does not handle Solr requests.  It doesn't know anything at
all about Solr.  It is Solr that uses ZK to coordinate the cluster.

If you are using the Java client called CloudSolrClient, then you will
most likely be informing it about ZK, not Solr, and it will
automatically determine what Solr servers there are by talking to ZK,
and then will talk directly to the correct Solr servers.  If you are not
using a client that is ZK-aware, then you will need a load balancer
sitting in front of your Solr servers, and your clients will talk to
that load balancer.  Don't put a load balancer in front of ZooKeeper.
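
To make the "determine what Solr servers there are" step concrete, here is a rough Python sketch of the selection logic a ZK-aware client performs once it has read the cluster state. It is not SolrJ's implementation: the sample dictionary is hand-written and only loosely modeled on the per-collection state Solr keeps in ZooKeeper, and the function names and node names are assumptions for illustration.

```python
import random

# Hand-written sample, loosely shaped like a collection's state in ZooKeeper.
# Core and node names are made up for illustration.
SAMPLE_STATE = {
    "shards": {
        "shard1": {
            "replicas": {
                "core_node1": {
                    "base_url": "http://solr1:8983/solr",
                    "node_name": "solr1:8983_solr",
                    "state": "active",
                    "leader": "true",
                },
                "core_node2": {
                    "base_url": "http://solr2:8983/solr",
                    "node_name": "solr2:8983_solr",
                    "state": "active",
                },
            }
        }
    }
}


def live_replica_urls(collection_state, live_nodes):
    """Return base URLs of replicas that are active and hosted on a live node."""
    urls = set()
    for shard in collection_state["shards"].values():
        for replica in shard["replicas"].values():
            if replica["state"] == "active" and replica["node_name"] in live_nodes:
                urls.add(replica["base_url"])
    return sorted(urls)


def pick_query_target(collection_state, live_nodes, rng=random):
    """Pick one live replica at random, a very simple form of load balancing."""
    candidates = live_replica_urls(collection_state, live_nodes)
    if not candidates:
        raise RuntimeError("no live replicas available")
    return rng.choice(candidates)
```

When solr1 drops out of the live-node set, only solr2 remains a candidate, which is the failover behaviour a client without this information has to get from a load balancer instead.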

Thanks,
Shawn

Re: how to access solr in solrcloud

Florian Gleixner
The advantage of CloudSolrClient over haproxy/nginx/... solutions is
that a client using ZooKeeper registers with ZooKeeper, and if a Solr
node goes down, the Solr node may inform ZooKeeper, which may in turn
inform all registered clients. Failover can be much faster with
CloudSolrClient than with haproxy or similar solutions.
CloudSolrClient also knows which node is the leader, and when indexing
it routes documents to the leader, which avoids overhead.
I've written a SolrCloudProxy which can be used to connect non-cloud-aware
clients to a SolrCloud cluster. The proxy uses CloudSolrClient with all
its advantages. It is not yet production ready, but you may want to try it:
https://gitlab.lrz.de/a2814ad/SolrCloudProxy




RE: how to access solr in solrcloud

Gu, Steve (CDC/CDC OD/OADS) (CTR)
Hi, Florian

We need to pass the ZooKeeper URL to CloudSolrClient.  Since there are multiple ZK servers, is it common practice to put a proxy server in front of ZooKeeper?

Thanks for your advice.
Steve

Re: how to access solr in solrcloud

Shawn Heisey-2
ZooKeeper should not be behind a load balancer.  Your client should have
all of the actual addresses/ports of your ZK servers.  ZK high
availability doesn't work right if a given address/port can suddenly
change to a different server.
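
Giving the client "all of the actual addresses/ports" usually means either a list of host:port entries or one comma-separated ZooKeeper connect string, optionally ending in a chroot path. A small Python sketch of assembling such a string follows; the three host names and the `/solr` chroot in the usage note are placeholders, and `zk_connect_string` is a hypothetical helper rather than part of any client library.

```python
def zk_connect_string(hosts, chroot=""):
    """Join ZooKeeper host:port pairs into one comma-separated connect string.

    An optional chroot (for example "/solr") is appended once, at the end.
    """
    if not hosts:
        raise ValueError("at least one ZooKeeper host is required")
    for host in hosts:
        name, sep, port = host.rpartition(":")
        if not sep or not name or not port.isdigit():
            raise ValueError(f"expected host:port, got {host!r}")
    if chroot and not chroot.startswith("/"):
        raise ValueError("chroot must start with '/'")
    return ",".join(hosts) + chroot


# Placeholder three-node ensemble; every real server is listed individually.
ZK_HOSTS = ["zk1:2181", "zk2:2181", "zk3:2181"]
```

With these placeholders, `zk_connect_string(ZK_HOSTS, "/solr")` yields `zk1:2181,zk2:2181,zk3:2181/solr`. The client then connects to whichever listed server is reachable, so no proxy is needed in front of the ensemble.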

Thanks,
Shawn

Re: how to access solr in solrcloud

Walter Underwood
In reply to this post by Gu, Steve (CDC/CDC OD/OADS) (CTR)
Use direct connections to Zookeeper. Using a load balancer or proxy is not recommended.

Zookeeper needs direct TCP connections. It is not an HTTP server.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

RE: how to access solr in solrcloud

Gu, Steve (CDC/CDC OD/OADS) (CTR)
That makes sense.  CloudSolrClient accepts a list of zookeepers.  It works beautifully.

Thanks
Steve
