Create too many zookeeper connections when recreate CloudSolrServer instance

wg85907
Hi Community,
        I use Solr (4.10.2) as an indexing tool, and I use a singleton CloudSolrServer instance to query Solr. When an exception occurs, for example when the current Solr server does not respond, I create a new CloudSolrServer instance and shut down the old one. We have many query threads that share the same CloudSolrServer instance. Suppose thread A hits an exception: it creates a new CloudSolrServer instance and begins to shut down the current one. From the Solr code I know that the first step of the shutdown is to close the Zookeeper connection. At the same time, thread B may still be querying with the old instance; the first step of a query is to check the Zookeeper connection and, if it does not exist, create one. Thread A then proceeds with the rest of the shutdown, and the Zookeeper connection created by thread B is left open with nothing using it. Because of this, Zookeeper connections accumulate until no new one can be created, and we get the exception below on the Zookeeper server side:

2017-07-06 09:42:37,595 [myid:5] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:10199:NIOServerCnxnFactory@193] - Too many connections from /169.171.87.37 - max is 60

      So I just want to know whether I am using CloudSolrServer in the wrong way, and whether you have any suggestions on how to meet my requirement.
Regards,
Geng, Wei

Re: Create too many zookeeper connections when recreate CloudSolrServer instance

Shawn Heisey-2
On 7/14/2017 6:29 AM, wg85907 wrote:
>         I use Solr (4.10.2) as an indexing tool, and I use a singleton
> CloudSolrServer instance to query Solr. When an exception occurs, for
> example when the current Solr server does not respond, I create a new
> CloudSolrServer instance and shut down the old one.

Why shut down the object and create a new one?  That should not be necessary.

The SolrJ client objects, including CloudSolrServer (CloudSolrClient in
5.0 and later) are designed to be created once and used by multiple
threads until program exit.  Any problems you encounter with them should
be due to either an incorrect query or server side problems ... if you
are finding that the client stops working after encountering an error
and never starts working again, that's either a problem with your system
or a bug in SolrJ.  A problem with your system is more likely than a
bug, but a bug is always possible.
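
One way to act on this advice is to keep the single shared client and retry a failed request on that same instance, rather than shutting it down and building a replacement. The block below is only a minimal sketch under stated assumptions: `SharedClientRetry`, `callWithRetry`, and the linear backoff policy are hypothetical names and choices for illustration, not part of SolrJ, and the request is passed as a plain `Callable` so the sketch compiles without SolrJ on the classpath.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of SolrJ: retries a request against the SAME
// shared client instead of shutting the client down and creating a new one.
final class SharedClientRetry {
    private SharedClientRetry() {}

    /**
     * Runs the request up to maxAttempts times and returns the first successful result.
     * If every attempt fails, the exception from the last attempt is rethrown.
     */
    static <T> T callWithRetry(Callable<T> request, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (InterruptedException e) {
                throw e; // do not retry when the calling thread is being interrupted
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff (an assumption): wait a little longer after each failure.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }
}
```

With SolrJ 4.x on Java 8 or later, a call might look roughly like `SharedClientRetry.callWithRetry(() -> server.query(params), 3, 200L)`, where `server` is the one CloudSolrServer created at startup and `params` is the query. Every thread passes the same `server`; nothing in the retry path calls shutdown(), so no thread can be left holding a half-closed client.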

Since you're using CloudSolrServer and not CloudSolrClient, I am
assuming that your SolrJ version is also 4.x.  Upgrading is strongly
recommended.  Here are a few things you should know about why I am
making that recommendation:  Development on 4.x is completely dead since
the release of 6.0 in April 2016 -- any bug found in 4.x will NOT be
fixed.  The 5.x branch is in maintenance mode, which means that only
very major bugs will be fixed.  The current stable branch is 6.x -- the
vast majority of problems will only be fixed there.  The project is in
the process of gearing up for a 7.0 release, which will end development
on 5.x, put 6.x in maintenance mode, and make 7.x the stable branch.

> 2017-07-06 09:42:37,595 [myid:5] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:10199:NIOServerCnxnFactory@193] - Too many connections from /169.171.87.37 - max is 60
>       So I just want to know whether I am using CloudSolrServer in the wrong way, and whether you have any suggestions on how to meet my requirement.

You should not be creating new CloudSolrServer instances.  That
exception may indicate that there is a bug in the shutdown method, but
as already said, you should not need to make a new client object.  Note
that a bug in the SolrServer#shutdown method in 4.x or 5.x isn't going
to be fixed.  In 6.x, the shutdown method is gone and the SolrClient
object now uses a close() method.  If 6.x has a problem with close(),
that is something we need to know.

Thanks,
Shawn


Re: Create too many zookeeper connections when recreate CloudSolrServer instance

wg85907
Hi Shawn,
        Thanks for your detailed explanation. The reason I wanted to shut down the CloudSolrServer instance and create a new one is that I am concerned about whether it can successfully reconnect to the Zookeeper server if the Zookeeper cluster has an issue and reboots. I will run related tests with version 6.5.0, which is the version I want to upgrade to. If there is any issue, I will report it to you and your team as you suggested. In any case, I will abandon the approach of shutting down/closing the CloudSolrServer instance and creating a new one. The alternative option is to manage the Zookeeper connection myself by extending the class ZkClientClusterStateProvider.
Regards,
Geng, Wei

Re: Create too many zookeeper connections when recreate CloudSolrServer instance

Walter Underwood
If your Zookeeper cluster is rebooting frequently, you have much, much worse problems than client connections.

Is Zookeeper unstable in your installation? If so, fix that.

Stop hacking the client.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)


> On Jul 17, 2017, at 1:48 AM, wg85907 <[hidden email]> wrote:
>
> Hi Shawn,
>        Thanks for your detailed explanation. The reason I wanted to shut
> down the CloudSolrServer instance and create a new one is that I am
> concerned about whether it can successfully reconnect to the Zookeeper
> server if the Zookeeper cluster has an issue and reboots. I will run
> related tests with version 6.5.0, which is the version I want to upgrade
> to. If there is any issue, I will report it to you and your team as you
> suggested. In any case, I will abandon the approach of shutting
> down/closing the CloudSolrServer instance and creating a new one. The
> alternative option is to manage the Zookeeper connection myself by
> extending the class ZkClientClusterStateProvider.
> Regards,
> Geng, Wei
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Create-too-many-zookeeper-connections-when-recreate-CloudSolrServer-instance-tp4346040p4346295.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Create too many zookeeper connections when recreate CloudSolrServer instance

wg85907
I do not mean that my Zookeeper cluster is rebooting frequently; I just want to ensure that my query service stays stable when the Zookeeper cluster has an issue or reboots. I will do some tests to check whether there is a problem here. Maybe the current Zookeeper client already handles this case well. Hacking the client will always be the last choice.
Regards,
Geng, Wei

Re: Create too many zookeeper connections when recreate CloudSolrServer instance

Shawn Heisey-2
In reply to this post by wg85907
On 7/17/2017 2:48 AM, wg85907 wrote:
>         Thanks for your detailed explanation. The reason I wanted to shut down the CloudSolrServer instance and create a new one is that I am concerned about whether it can successfully reconnect to the Zookeeper server if the Zookeeper cluster has an issue and reboots.

I know that as long as the zookeeper ensemble (which is three or more ZK
servers working together) does not lose quorum, and Solr is connected to
all of the servers in the ensemble, Solr will be fine.

I have heard someone on the list say that if ZK loses quorum (which
means that the number of running servers drops below a required minimum)
then Solr doesn't recover correctly when quorum is re-established.  If
you have three servers, then at least two of them must be working to
maintain quorum.  If there are five servers, then at least three of them
must be working.
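
The quorum arithmetic above can be sketched as a small helper. This is an illustrative sketch only, assuming the default majority quorum; `QuorumMath` and its method names are hypothetical and are not part of Solr or ZooKeeper.

```java
// Majority-quorum arithmetic for a ZooKeeper ensemble.
// Hypothetical illustration; not an API of Solr or ZooKeeper.
final class QuorumMath {
    private QuorumMath() {}

    /** Smallest number of running servers that forms a strict majority of the ensemble. */
    static int majority(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensembleSize must be >= 1");
        }
        return ensembleSize / 2 + 1; // integer division: 3 -> 2, 5 -> 3
    }

    /** Number of servers that can be down while a majority is still running. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 4, 5}) {
            System.out.println(n + " servers: quorum needs " + majority(n)
                    + ", tolerates " + toleratedFailures(n) + " down");
        }
    }
}
```

Under this rule a four-server ensemble tolerates only one failure, the same as three servers, which is why odd ensemble sizes are the usual choice.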

I do not think that the problem described above has been confirmed as an
issue.  If it does turn out to be true, then the problem is not likely
to be in Solr, but in ZK -- Solr uses the ZK client, which completely
manages that communication.

Thanks,
Shawn


Re: Create too many zookeeper connections when recreate CloudSolrServer instance

Walter Underwood
In reply to this post by wg85907
The entire point of a Zookeeper cluster is that it continues to be available when one (or more) nodes are down.

If you want more failure tolerance, run a five node Zookeeper cluster instead of a three node cluster.

Hacking the client will not increase robustness. Right now, you are hurting robustness by being too clever with the client.

Hacking the client is not a last choice, it is a bad choice.

For queries, there is not much benefit in running the cloud-aware client. A regular load balancer works just about as well. We use the Amazon load balancers.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)


> On Jul 18, 2017, at 3:25 AM, wg85907 <[hidden email]> wrote:
>
> I do not mean that my Zookeeper cluster is rebooting frequently; I just
> want to ensure that my query service stays stable when the Zookeeper
> cluster has an issue or reboots. I will do some tests to check whether
> there is a problem here. Maybe the current Zookeeper client already
> handles this case well. Hacking the client will always be the last choice.
> Regards,
> Geng, Wei


Re: Create too many zookeeper connections when recreate CloudSolrServer instance

wg85907
Hi Walter, Shawn,
        Thanks for your quick reply; the information you provided is really helpful. Now I know how to find the right way to resolve my issue.
Regards,
Geng, Wei