zookeeper quorum falling apart with continuous leader election
Hi,

I have a 3-node ZooKeeper 3.5.0.1458648 quorum on my setup.
One of the ZK servers in the cluster went down due to a bad disk. Since then,
leader election keeps running in a loop (it starts, completes, and starts
again). The loop repeats every couple of minutes.
Even restarting the ZooKeeper server on both remaining nodes doesn't help
recover from this loop.
The network connection looks fine, though: I can telnet to the leader election
port and ssh from one node to the other.
The ZooKeeper client on each node uses "127.0.0.1:2181" as its connection
string, so if the local ZooKeeper server is down, the client app on that
node is dead.
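
For anyone reproducing the checks described above, here is a minimal Python sketch, not a definitive diagnostic. It assumes the servers answer ZooKeeper's "srvr" four-letter command on the client port (2181, as in the post). The hostnames in ENSEMBLE are hypothetical placeholders, and connect_string, parse_mode and probe are illustrative helper names, not part of any ZooKeeper API. The sketch builds a connection string that lists every server instead of only 127.0.0.1, and asks each server which role it currently reports.

```python
import socket

# Hypothetical hostnames (assumption); substitute the real ensemble members.
ENSEMBLE = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]
CLIENT_PORT = 2181  # client port mentioned in the post


def connect_string(hosts, port=CLIENT_PORT):
    """Build a comma-separated host:port connection string listing every server."""
    return ",".join(f"{host}:{port}" for host in hosts)


def parse_mode(srvr_output):
    """Return the value of the 'Mode:' line in 'srvr' output, or None if absent."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def probe(host, port=CLIENT_PORT, timeout=3.0):
    """Send 'srvr' to one server and return the mode it reports.

    A server that is up but not part of a working quorum typically answers
    without a 'Mode:' line, which is reported here as 'no mode reported'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"srvr")
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # server closes the connection after replying
                    break
                chunks.append(data)
    except OSError as exc:  # refused, timed out, or name resolution failed
        return f"unreachable ({exc})"
    text = b"".join(chunks).decode("utf-8", "replace")
    return parse_mode(text) or "no mode reported"
```

Calling probe(host) for each entry in ENSEMBLE every few seconds would show whether the two surviving servers ever settle into a leader and a follower between election rounds. Passing connect_string(ENSEMBLE) to the client lets it fail over to another server when the local one is down.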

I have uploaded zookeeper.log for both nodes at the following link:
https://dl.dropboxusercontent.com/u/36429721/zkSupportLog.tar.gz

Any idea what might be wrong with the quorum? Please note that restarting
the ZooKeeper server on both nodes doesn't help to recover from this situation.

Thanks & Regards,
Deepak