Zookeeper delay to reconnect
Hi,
ZooKeeper implements a random delay of up to 1 second before trying to reconnect.

ClientCnxn$SendThread:

    @Override
    public void run() {
        ...
        while (state.isAlive()) {
            try {
                if (!clientCnxnSocket.isConnected()) {
                    if (!isFirstConnect) {
                        try {
                            // random delay in [0, 1000) ms before every reconnect
                            Thread.sleep(r.nextInt(1000));
                        } catch (InterruptedException e) {
                            LOG.warn("Unexpected exception", e);
                        }
                    ...

This creates "outages" of up to 1 s (even with a simple retry on
ConnectionLoss) on a perfectly healthy cluster, for example during a
rolling restart. In our scenario this might be a problem under high load,
because it causes a spike in the number of requests waiting on a ZK operation.
Would it be a better strategy to make at least the first reconnect
attempt immediately? Or is there more to it?
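
To make the proposal concrete, here is a minimal sketch of the behaviour I have in mind. It is not ZooKeeper code: the class ReconnectDelayPolicy and its methods are hypothetical names made up for illustration. The only part taken from the snippet above is the random delay drawn with nextInt(1000). The first attempt after a connection loss gets no delay; later attempts keep the random delay, which I assume is there to spread out reconnects from many clients.

```java
import java.util.Random;

/**
 * Hypothetical reconnect-delay policy (illustration only, not part of ZooKeeper).
 * The first reconnect attempt after a connection loss is immediate; every
 * further attempt waits a random time in [0, maxDelayMillis) milliseconds.
 */
class ReconnectDelayPolicy {
    private final int maxDelayMillis;
    private final Random random;
    private int attemptsSinceLoss = 0;

    ReconnectDelayPolicy(int maxDelayMillis, Random random) {
        if (maxDelayMillis <= 0) {
            throw new IllegalArgumentException("maxDelayMillis must be positive");
        }
        this.maxDelayMillis = maxDelayMillis;
        this.random = random;
    }

    /** Returns the delay in milliseconds to apply before the next reconnect attempt. */
    synchronized int nextDelayMillis() {
        int delay = (attemptsSinceLoss == 0) ? 0 : random.nextInt(maxDelayMillis);
        attemptsSinceLoss++;
        return delay;
    }

    /** Resets the policy; call this once a connection has been established. */
    synchronized void onConnected() {
        attemptsSinceLoss = 0;
    }

    public static void main(String[] args) {
        ReconnectDelayPolicy policy = new ReconnectDelayPolicy(1000, new Random());
        System.out.println("attempt 1 delay: " + policy.nextDelayMillis() + " ms");
        System.out.println("attempt 2 delay: " + policy.nextDelayMillis() + " ms");
        policy.onConnected();
        System.out.println("after reconnect, next delay: " + policy.nextDelayMillis() + " ms");
    }
}
```

With such a policy a rolling restart would cost one immediate attempt against the next server, and the random delay would apply only if that attempt also fails.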

Regards,
Sergei
