Zookeeper >> mail # user >> Leader election failure

Leader election failure
I have a 5-node cluster configured with ZooKeeper's dynamic reconfiguration
support.  It has been through several reconfigurations, but at the moment I
am simply trying to start 3 of the nodes to get ZK accessible.  I have
confirmed that the myid files match the entries in the dynamic membership
file for the 3 nodes in question.  However, when I start up the three nodes,
the node that wins the election (myid 8) logs the following error:

2012-07-26 22:26:01,037 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:Leader@445] - LEADING - LEADER ELECTION TOOK - 13
2012-07-26 22:26:01,039 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:FileSnap@83] - Reading snapshot /sf/data/zookeeper/10.10.5.27/version-2/snapshot.3000001e3
2012-07-26 22:26:01,065 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:FileTxnSnapLog@270] - Snapshotting: 0x3000001e3 to /sf/data/zookeeper/10.10.5.27/version-2/snapshot.3000001e3
2012-07-26 22:26:10,837 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@635] - Notification: 8 (n.leader), 0x3000001e3 (n.zxid), 0x1 (n.round), LOOKING (n.state), 9 (n.sid), 0x3 (n.peerEPoch), LEADING (my state)300000147 (n.config version)
2012-07-26 22:26:20,849 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@635] - Notification: 8 (n.leader), 0x3000001e3 (n.zxid), 0x1 (n.round), LOOKING (n.state), 9 (n.sid), 0x3 (n.peerEPoch), LEADING (my state)300000147 (n.config version)
2012-07-26 22:26:21,083 [myid:8] - WARN  [QuorumPeer[myid=8]/10.10.5.27:2181:QuorumPeer@949] - Unexpected exception
java.lang.InterruptedException: Timeout while waiting for epoch from quorum
        at org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:1207)
        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:464)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:946)
2012-07-26 22:26:21,083 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:Leader@614] - Shutting down
2012-07-26 22:26:21,083 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:Leader@620] - Shutdown called
java.lang.Exception: shutdown Leader! reason: Forcing shutdown
        at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:620)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:952)
2012-07-26 22:26:21,084 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:ZooKeeperServer@413] - shutting down
2012-07-26 22:26:21,084 [myid:8] - INFO  [LearnerCnxAcceptor-0.0.0.0/0.0.0.0:2182:Leader$LearnerCnxAcceptor@407] - exception while shutting down acceptor: java.net.SocketException: Socket closed
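
One way to see which peers the leader actually heard from is to pull the FastLeaderElection notifications out of the log. The following is only a rough sketch in Python: the regular expression is written against the notification lines exactly as they appear above, and the function names (parse_notifications, peers_heard_from) are made up for illustration, not part of ZooKeeper.

```python
import re

# Pattern for the "Notification:" entries as they appear in the log above.
# Assumption: other ZooKeeper versions may format this line differently.
NOTIFICATION = re.compile(
    r"Notification:\s*(?P<leader>\d+)\s*\(n\.leader\),\s*"
    r"(?P<zxid>0x[0-9a-fA-F]+)\s*\(n\.zxid\),\s*"
    r"(?P<round>0x[0-9a-fA-F]+)\s*\(n\.round\),\s*"
    r"(?P<state>\w+)\s*\(n\.state\),\s*"
    r"(?P<sid>\d+)\s*\(n\.sid\),\s*"
    r"(?P<epoch>0x[0-9a-fA-F]+)\s*\(n\.peerEPoch\),\s*"
    r"(?P<my_state>\w+)\s*\(my state\)"
)


def parse_notifications(log_text):
    """Return one dict per notification found in log_text."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in NOTIFICATION.finditer(flat)]


def peers_heard_from(log_text):
    """Return the set of server ids (n.sid) that sent a notification."""
    return {int(n["sid"]) for n in parse_notifications(log_text)}


if __name__ == "__main__":
    sample = """
    2012-07-26 22:26:10,837 [myid:8] - INFO
     [WorkerReceiver[myid=8]:FastLeaderElection@635] - Notification: 8
    (n.leader), 0x3000001e3 (n.zxid), 0x1 (n.round), LOOKING (n.state), 9
    (n.sid), 0x3 (n.peerEPoch), LEADING (my state)300000147 (n.config version)
    """
    for n in parse_notifications(sample):
        print(n["sid"], n["state"], "votes for", n["leader"])
```

Run against the excerpt above, this reports notifications from sid 9 only, both in the LOOKING state and both voting for 8. The excerpt contains nothing from the third node, which may be worth checking, since the exception is raised while the leader waits for a quorum in getEpochToPropose.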

I am not sure what to make of this or how to debug it from here.  Any
pointers on how to track down what might be wrong, or simply a list of the
usual causes of this error, would be appreciated.
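
For reference, the myid check described above can be scripted, along with a count of how many voters the current membership requires. This is a minimal sketch in Python that assumes the membership file uses server.N=host:port:port[:role][;clientport] lines; the addresses and ids in the example are hypothetical placeholders rather than my real configuration, and the helper names are invented for illustration.

```python
import re

# Matches "server.<id>=<spec>" lines; everything else is ignored.
SERVER_LINE = re.compile(r"^\s*server\.(\d+)\s*=\s*(\S+)")


def parse_members(config_text):
    """Map server id -> server spec for every server.N line."""
    members = {}
    for line in config_text.splitlines():
        m = SERVER_LINE.match(line)
        if m:
            members[int(m.group(1))] = m.group(2)
    return members


def participants(members):
    """Ids of voting members; a spec ending in ':observer' does not vote."""
    voters = set()
    for sid, spec in members.items():
        address_part = spec.split(";")[0]  # drop the client port section
        if not address_part.endswith(":observer"):
            voters.add(sid)
    return voters


def majority(voter_count):
    """Smallest number of voters that forms a majority quorum."""
    return voter_count // 2 + 1


def myid_is_member(myid_text, members):
    """True if the contents of a myid file name a configured server."""
    return int(myid_text.strip()) in members


if __name__ == "__main__":
    # Hypothetical membership; real ids and addresses will differ.
    example = """
    server.6=10.0.0.6:2182:2183:participant;2181
    server.7=10.0.0.7:2182:2183:participant;2181
    server.8=10.0.0.8:2182:2183:participant;2181
    server.9=10.0.0.9:2182:2183:participant;2181
    server.10=10.0.0.10:2182:2183:participant;2181
    """
    members = parse_members(example)
    voters = participants(members)
    print("voters:", sorted(voters), "quorum needs:", majority(len(voters)))
    print("myid 8 configured:", myid_is_member("8\n", members))
```

If the membership file still lists all five nodes as participants, a majority is three, so every one of the three started nodes would have to reach the leader before the epoch wait can complete.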

Thanks!
Jared