Zookeeper >> mail # user >> leap second excitement

Re: leap second excitement
Thanks for the report, Scott. From what I've seen so far this seems to
be a Linux kernel bug and not specific to Java/ZK. Here are a couple of
the more informative links I've seen:
http://hackerne.ws/item?id=4188412
http://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix

Does anyone have specific insight into how this expressed itself in
Java? I've seen some references to futex being the root cause (from the
Java perspective): "It's a critical Linux bug that causes futex to
timeout, and anything that uses it to behave incorrectly."

Patrick

On Sun, Jul 1, 2012 at 2:58 PM, Scott Fines <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> It appears that ZooKeeper is subject to the Linux leap second bug that has caused problems with Cassandra and other services. At least, I discovered that after 6 hours of trying to figure out why my cluster wasn't giving me a quorum.
>
> A link to the kernel bug report is at https://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6b43ae8a619d17c4935c3320d2ef9e92bdeed05d
>
> As for what you might see in your logs: I saw a lost quorum and insanely high load on my servers. When I shut down ZooKeeper to bring it back up, one machine would report a read timeout during leader election, then report that the server told it to shut down. After that, it would be stuck in the LOOKING phase forever, while another machine might be stuck in any other phase of the election.
>
> The fix is simple, though. Just stop ZooKeeper, execute
>
> date -s "`date`"
>
> or restart your NTP daemon, then start ZooKeeper back up.
>
> You MUST restart ZooKeeper; otherwise the election state doesn't recover (or, at least, it didn't recover for me).
>
> Hope this helps save someone else the 7 hours of agony I just went through.
>
> Scott Fines
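
For anyone who wants to script the sequence Scott describes (stop ZooKeeper, re-set the clock, start ZooKeeper), here is a minimal sketch. It is not from the thread beyond the `date -s` step: the function names, the `DRY_RUN` switch, and the `zkServer.sh` location are all assumptions made for illustration, so adjust them to however ZooKeeper is installed and managed on your hosts. By default it only prints what it would do. The small `zk_mode` helper assumes the `srvr` four-letter command is reachable on the client port and simply extracts the `Mode:` line from its output.

```shell
#!/bin/sh
# Sketch of the recovery sequence from the message above.
# ASSUMPTIONS (not from the thread): ZooKeeper is controlled through
# zkServer.sh at the path below, and DRY_RUN=1 means "print, don't execute".

ZK_CTL="${ZK_CTL:-/opt/zookeeper/bin/zkServer.sh}"   # hypothetical install path
DRY_RUN="${DRY_RUN:-1}"                              # default: print only

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop ZooKeeper, set the clock to its current value, start ZooKeeper.
# Setting the clock needs root when DRY_RUN=0.
leap_second_recover() {
    run "$ZK_CTL" stop || return 1
    # Same effect as the thread's: date -s "`date`"
    run date -s "$(date)" || return 1
    run "$ZK_CTL" start || return 1
}

# Read `srvr` output on stdin and print the server mode
# (leader, follower, standalone). Prints nothing if no Mode line is present,
# for example while the server is not serving requests.
zk_mode() {
    sed -n 's/^Mode: //p'
}
```

To try it, source the file and call `leap_second_recover` (with `DRY_RUN=0` as root once the printed commands look right). Afterwards, something like `echo srvr | nc localhost 2181 | zk_mode` on each server should show whether it has rejoined the ensemble; the host and port here are the common defaults, not values from the thread.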