Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Zookeeper >> mail # user >> Query regarding zookeeper


Copy link to this message
-
Query regarding zookeeper
Hi,

I have 5 box zookeeper quorum. 2 / 5 are unable to deserialize the message from leader and hence they are not coming up. I tried restarting few times but still seeing the same issue. This issue is observed when we were trying to do the following:
1>   We have a script which brings down the zookeeper server on the leader. We are trying to test if new follower takes over once leader is down.

2>   The script has been running since 2-3 days. Surprisingly we don't see any issue for the first 2 days but after that we hit the below issue wherein 2 out 5 zookeeper machines are unable to deserialize the message.

zookeeper server = 3.4.5

Following are the zookeeper settings:

tickTime=4000
initLimit=20
syncLimit=15
snapCount=100000

Logs from Follower

---------------

Thu Mar 28 19:14:14 2013: 2013-03-28 19:14:14,158 - INFO  [WorkerReceiver
Thread:FastLeaderElection@496] - Notification: 2 (n.leader), 0 (n.zxid), 3
(n.round), LOOKING (n.state), 2 (n.sid), FOLLOWING (my state)
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,811 - WARN
[QuorumPeer:/0.0.0.0:2181:Follower@82] - Exception when following the leader
Thu Mar 28 19:14:41 2013: java.io.EOFException
Thu Mar 28 19:14:41 2013:       at
java.io.DataInputStream.readInt(DataInputStream.java:392)
Thu Mar 28 19:14:41 2013:       at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
Thu Mar 28 19:14:41 2013:       at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84)
Thu Mar 28 19:14:41 2013:       at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
Thu Mar 28 19:14:41 2013:       at
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:148)
Thu Mar 28 19:14:41 2013:       at
org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:332)
Thu Mar 28 19:14:41 2013:       at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:75)
Thu Mar 28 19:14:41 2013:       at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:645)
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,811 - INFO
[QuorumPeer:/0.0.0.0:2181:Follower@165] - shutdown called
Thu Mar 28 19:14:41 2013: java.lang.Exception: shutdown Follower
Thu Mar 28 19:14:41 2013:       at
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:165)
Thu Mar 28 19:14:41 2013:       at
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:649)
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,812 - INFO
[QuorumPeer:/0.0.0.0:2181:QuorumPeer@621] - LOOKING
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,812 - INFO
[QuorumPeer:/0.0.0.0:2181:FastLeaderElection@663] - New election. My id =  1,
Proposed zxid = 0
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,813 - INFO  [WorkerReceiver
Thread:FastLeaderElection@496] - Notification: 1 (n.leader), 0 (n.zxid), 3
(n.round), LOOKING (n.state), 1 (n.sid), LOOKING (my state)
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,814 - INFO  [WorkerReceiver
Thread:FastLeaderElection@496] - Notification: 4 (n.leader), 85899346051
(n.zxid), 2 (n.round), FOLLOWING (n.state), 3 (n.sid), LOOKING (my state)
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,814 - INFO  [WorkerReceiver
Thread:FastLeaderElection@496] - Notification: 4 (n.leader), 85899346051
(n.zxid), 2 (n.round), FOLLOWING (n.state), 2 (n.sid), LOOKING (my state)
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,814 - INFO  [WorkerReceiver
Thread:FastLeaderElection@496] - Notification: 4 (n.leader), 85899346051
(n.zxid), 2 (n.round), LEADING (n.state), 4 (n.sid), LOOKING (my state)
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,814 - INFO  [WorkerReceiver
Thread:FastLeaderElection@496] - Notification: 4 (n.leader), 85899346051
(n.zxid), 2 (n.round), FOLLOWING (n.state), 5 (n.sid), FOLLOWING (my state)
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,814 - INFO
[QuorumPeer:/0.0.0.0:2181:QuorumPeer@643] - FOLLOWING
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,815 - INFO
[QuorumPeer:/0.0.0.0:2181:ZooKeeperServer@151] - Created server with tickTime
2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir
/home/y/var/zookeeper/version-2 snapdir /home/y/var/zookeeper/version-2
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,817 - INFO
[QuorumPeer:/0.0.0.0:2181:Learner@294] - Getting a snapshot from leader
Thu Mar 28 19:14:41 2013: 2013-03-28 19:14:41,819 - INFO
[QuorumPeer:/0.0.0.0:2181:Learner@325] - Setting leader epoch 15
Thu Mar 28 19:14:54 2013: 2013-03-28 19:14:54,208 - INFO  [WorkerReceiver
Thread:FastLeaderElection@496] - Notification: 2 (n.leader), 0 (n.zxid), 3
(n.round), LOOKING (n.state), 2 (n.sid), FOLLOWING (my state)
Logs from Leader
Thu Mar 28 19:18:14 2013: 2013-03-28 19:18:14,416 - ERROR
[LearnerHandler-/98.139.207.73:37505:LearnerHandler@562] - Unexpected exception
causing shutdown while sock still open
Thu Mar 28 19:18:14 2013: java.net.SocketTimeoutException: Read timed out
Thu Mar 28 19:18:14 2013:       at
java.net.SocketInputStream.socketRead0(Native Method)
Thu Mar 28 19:18:14 2013:       at
java.net.SocketInputStream.read(SocketInputStream.java:150)
Thu Mar 28 19:18:14 2013:       at
java.net.SocketInputStream.read(SocketInputStream.java:121)
Thu Mar 28 19:18:14 2013:       at
java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
Thu Mar 28 19:18:14 2013:       at
java.io.BufferedInputStream.read(BufferedInputStream.java:254)
Thu Mar 28 19:18:14 2013:       at
java.io.DataInputStream.readInt(DataInputStream.java:387)
Thu Mar 28 19:18:14 2013:       at
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
Thu Mar 28 19:18:14 2013:       at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
Thu Mar 28 19:18:14 2013:       at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
Thu Mar 28 19:18:14 2013:       at
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:450)
Thu Mar 28 19:18:14 2013: 2013-03-28 19:18:14,417 - WARN
[LearnerHandler-/98.139.207.73:37505
+
Kathleen Ting 2013-04-04, 05:39
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB