Zookeeper >> mail # user >> 15 minutes to sync?


Re: 15 minutes to sync?
You have an 11 GB snapshot file. That's very large. Did someone
unexpectedly overload the server with znode creations?

When a follower comes up, the leader needs to serialize the znodes to
the snapshot file and stream it to the follower, which saves it locally
and then deserializes it. (11 GB in 15 minutes averages about 12 MB/s
for this whole process.)
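
The rate estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the byte count is the size of snapshot.3900a6b149 from the quoted listing, the 15 minutes is the upper end of the reported sync time, and the function name is made up for illustration.

```python
def sync_throughput_mb_per_s(snapshot_bytes: int, sync_seconds: float) -> float:
    """Average end-to-end rate in MB/s (1 MB = 10**6 bytes).

    Covers the whole sync (serialize, stream, save, deserialize),
    not just the network transfer.
    """
    if sync_seconds <= 0:
        raise ValueError("sync_seconds must be positive")
    return snapshot_bytes / sync_seconds / 1e6


# Size of snapshot.3900a6b149 as shown in the quoted ls output.
SNAPSHOT_BYTES = 11_126_306_780
# Upper end of the reported 10-15 minute sync time.
SYNC_SECONDS = 15 * 60

rate = sync_throughput_mb_per_s(SNAPSHOT_BYTES, SYNC_SECONDS)
print(f"{rate:.1f} MB/s")
```

With a 10 minute sync instead, the same snapshot works out to roughly 18.5 MB/s, so the reported range is consistent with a low-tens-of-MB/s pipeline.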

This is often exacerbated by the max heap size and GC interactions.

Patrick

On Tue, Jul 31, 2012 at 2:23 PM, Jordan Zimmerman
<[EMAIL PROTECTED]> wrote:
> BTW - this is 3.3.5
>
> On Jul 31, 2012, at 2:22 PM, Jordan Zimmerman <[EMAIL PROTECTED]> wrote:
>
>> We've had a few outages of our ZK cluster recently. When trying to bring the cluster back up it's been taking 10-15 minutes for the followers to sync with the Leader. Any idea what might cause this? Here's an ls of the data dir:
>>
>> -rw-r--r-- 1 zookeeperserverprod nac    67108880 Jul 31 20:39 log.3900a4bc75
>> -rw-r--r-- 1 zookeeperserverprod nac    67108880 Jul 31 20:40 log.3900a634ee
>> -rw-r--r-- 1 zookeeperserverprod nac    67108880 Jul 31 21:21 log.3a00000001
>> -rw-r--r-- 1 zookeeperserverprod nac    67108880 Jul 31 21:22 log.3a000139a2
>> -rw-r--r-- 1 zookeeperserverprod nac  9279729723 Jul 31 20:42 snapshot.3900a634ec
>> -rw-r--r-- 1 zookeeperserverprod nac 11126306780 Jul 31 21:09 snapshot.3900a6b149
>> -rw-r--r-- 1 zookeeperserverprod nac  4153727423 Jul 31 21:22 snapshot.3a000139a0
>>
>
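
To see at a glance how large each snapshot in the quoted listing is, the ls output can be summarized with a short script. A minimal sketch, assuming `ls -l` output in exactly the column layout shown above (size in the fifth field, file name in the ninth); the function name is hypothetical and nothing here uses a ZooKeeper API.

```python
# The data dir listing, copied from the quoted message.
LISTING = """\
-rw-r--r-- 1 zookeeperserverprod nac    67108880 Jul 31 20:39 log.3900a4bc75
-rw-r--r-- 1 zookeeperserverprod nac    67108880 Jul 31 20:40 log.3900a634ee
-rw-r--r-- 1 zookeeperserverprod nac    67108880 Jul 31 21:21 log.3a00000001
-rw-r--r-- 1 zookeeperserverprod nac    67108880 Jul 31 21:22 log.3a000139a2
-rw-r--r-- 1 zookeeperserverprod nac  9279729723 Jul 31 20:42 snapshot.3900a634ec
-rw-r--r-- 1 zookeeperserverprod nac 11126306780 Jul 31 21:09 snapshot.3900a6b149
-rw-r--r-- 1 zookeeperserverprod nac  4153727423 Jul 31 21:22 snapshot.3a000139a0
"""


def snapshot_sizes(listing: str) -> dict[str, int]:
    """Map each snapshot.* file name in an `ls -l` listing to its size in bytes."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # not a regular `ls -l` entry
        name, size = fields[8], fields[4]
        if name.startswith("snapshot."):
            sizes[name] = int(size)
    return sizes


sizes = snapshot_sizes(LISTING)
largest = max(sizes, key=sizes.get)
for name, size in sizes.items():
    print(f"{name}: {size / 1e9:.2f} GB")
print(f"largest: {largest}")
```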