Zookeeper >> mail # user >> Leader election failure


Jared Cantwell 2012-07-27, 04:40
Hanno Schlichting 2012-07-27, 05:11
Re: Leader election failure
We are currently testing out 3.5.0.  If the fix made it into 3.4.4, I
assume that issue is also fixed in 3.5.0?

~Jared

On Thu, Jul 26, 2012 at 11:11 PM, Hanno Schlichting <[EMAIL PROTECTED]> wrote:

> What version of ZK are you using? There's a bug in 3.4.x with 5 node
> clusters failing to agree on a leader. That's only solved in the yet
> unreleased 3.4.4.
>
> Hanno
>
> On 27.07.2012, at 06:40, Jared Cantwell <[EMAIL PROTECTED]> wrote:
>
> > I have a 5 node cluster configured using dynamic zookeeper.  It has been
> > through several reconfigurations, but at the moment I am simply trying to
> > start 3 of the nodes to get ZK accessible.  I have confirmed that the myid
> > files match the entries in the dynamic membership file for the 3 nodes in
> > question.  However, when I start up the three nodes I get the following
> > error:
> >
> > 2012-07-26 22:26:01,037 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:Leader@445] - LEADING - LEADER ELECTION TOOK - 13
> > 2012-07-26 22:26:01,039 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:FileSnap@83] - Reading snapshot /sf/data/zookeeper/10.10.5.27/version-2/snapshot.3000001e3
> > 2012-07-26 22:26:01,065 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:FileTxnSnapLog@270] - Snapshotting: 0x3000001e3 to /sf/data/zookeeper/10.10.5.27/version-2/snapshot.3000001e3
> > 2012-07-26 22:26:10,837 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@635] - Notification: 8 (n.leader), 0x3000001e3 (n.zxid), 0x1 (n.round), LOOKING (n.state), 9 (n.sid), 0x3 (n.peerEPoch), LEADING (my state)300000147 (n.config version)
> > 2012-07-26 22:26:20,849 [myid:8] - INFO  [WorkerReceiver[myid=8]:FastLeaderElection@635] - Notification: 8 (n.leader), 0x3000001e3 (n.zxid), 0x1 (n.round), LOOKING (n.state), 9 (n.sid), 0x3 (n.peerEPoch), LEADING (my state)300000147 (n.config version)
> > 2012-07-26 22:26:21,083 [myid:8] - WARN  [QuorumPeer[myid=8]/10.10.5.27:2181:QuorumPeer@949] - Unexpected exception
> > java.lang.InterruptedException: *Timeout while waiting for epoch from quorum*
> >        at org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:1207)
> >        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:464)
> >        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:946)
> > 2012-07-26 22:26:21,083 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:Leader@614] - Shutting down
> > 2012-07-26 22:26:21,083 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:Leader@620] - Shutdown called
> > java.lang.Exception: shutdown Leader! reason: Forcing shutdown
> >        at org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:620)
> >        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:952)
> > 2012-07-26 22:26:21,084 [myid:8] - INFO  [QuorumPeer[myid=8]/10.10.5.27:2181:ZooKeeperServer@413] - shutting down
> > 2012-07-26 22:26:21,084 [myid:8] - INFO  [LearnerCnxAcceptor-0.0.0.0/0.0.0.0:2182:Leader$LearnerCnxAcceptor@407] - exception while shutting down acceptor: java.net.SocketException: Socket closed
> >
> > I am not sure what to make of it or how to debug from here.  Any pointers
> > or suggestions on how to debug what might be wrong, or simply some usual
> > causes of this error would be appreciated.
> >
> > Thanks!
> > Jared
>
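
For anyone debugging a similar trace, one way to see which peers are still voting is to pull the fields out of the FastLeaderElection "Notification" lines. The sketch below is only an illustration: the regular expression is inferred from the two Notification lines quoted in this thread, the names NOTIFICATION, parse_notification and SAMPLE are made up for the example, and it assumes each log entry sits on a single line (mail clients often wrap them). Other ZooKeeper versions may log a different layout.

```python
import re

# Layout inferred from the Notification lines quoted in this thread;
# this is an assumption, not a documented ZooKeeper log format.
NOTIFICATION = re.compile(
    r"Notification: (?P<leader>\S+) \(n\.leader\), "
    r"(?P<zxid>\S+) \(n\.zxid\), "
    r"(?P<round>\S+) \(n\.round\), "
    r"(?P<state>\S+) \(n\.state\), "
    r"(?P<sid>\S+) \(n\.sid\), "
    r"(?P<peer_epoch>\S+) \(n\.peerEPoch\), "
    r"(?P<my_state>\S+) \(my state\)"
)


def parse_notification(line):
    """Return the notification fields as a dict, or None if the line does not match."""
    match = NOTIFICATION.search(line)
    return match.groupdict() if match else None


# One of the lines from the original report, joined onto a single line.
SAMPLE = (
    "2012-07-26 22:26:10,837 [myid:8] - INFO  "
    "[WorkerReceiver[myid=8]:FastLeaderElection@635] - Notification: 8 "
    "(n.leader), 0x3000001e3 (n.zxid), 0x1 (n.round), LOOKING (n.state), 9 "
    "(n.sid), 0x3 (n.peerEPoch), LEADING (my state)300000147 (n.config version)"
)

fields = parse_notification(SAMPLE)
if fields is not None:
    # Show who sent the vote, what state the sender is in, and whom it votes for.
    print(
        "peer %s is %s and votes for %s; local state is %s"
        % (fields["sid"], fields["state"], fields["leader"], fields["my_state"])
    )
```

Run over the whole log, this makes it easier to spot a peer that keeps reporting LOOKING while the local server already considers itself LEADING, which is the pattern visible in the report above.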
Jared Cantwell 2012-07-27, 14:54
David Nickerson 2012-07-27, 17:01