Connection closed exceptions with slow fsync and CancelledKeyExceptions
We have been trying to understand why our ZooKeeper cluster will occasionally
have a wave of connection closed exceptions. We have switched to
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode for garbage collection with
no noticeable improvements.

The symptoms are:

(1) All nodes show messages like "fsync-ing the write ahead log in
SyncThread:0 took 6309ms which will adversely effect operation latency. See
the ZooKeeper troubleshooting guide" with times typically around 5 seconds.
At least once, this fsync warning appeared in the leader's log immediately
before a wave of:

ERROR [CommitProcessor:0:NIOServerCnxn@445] - Unexpected Exception:
at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:73)
at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:77)

Our clients received ZookeeperConnectionClosed exceptions at this time and
all traffic on the ZooKeeper cluster essentially went to zero for a moment
before resuming normal operation with new connections.

(2) Probably unrelated since I haven't correlated it temporally with the
client errors, but running "sudo strace -r -T -f -p 9574 -e
trace=fsync,fdatasync -o trace.txt" turns up some messages like "10581
0.000246 --- SIGSEGV (Segmentation fault) @ 0 (0) ---"

ZK Version: 3.3.4
Cluster has 5 nodes running in EC2
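
One way to check whether the disk itself is responsible for the multi-second fsync times in symptom (1) is to time fsync calls directly on the volume that holds the transaction log. The following is only a minimal sketch: the function name measure_fsync, the 4 KB block size and the round count are illustrative assumptions, and it does not reproduce ZooKeeper's own write pattern.

```python
import os
import tempfile
import time


def measure_fsync(directory, rounds=20, block_size=4096):
    """Append `rounds` blocks to a scratch file in `directory`, fsync after
    each append, and return the fsync latencies in milliseconds.

    `directory` should live on the same volume as the ZooKeeper transaction
    log (an assumption about how the probe is meant to be used).
    """
    latencies = []
    block = b"\0" * block_size
    # The scratch file is deleted automatically when the block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(rounds):
            f.write(block)
            f.flush()                       # hand Python's buffer to the OS
            start = time.monotonic()
            os.fsync(f.fileno())            # ask the OS to persist to disk
            latencies.append((time.monotonic() - start) * 1000.0)
    return latencies
```

Calling measure_fsync on the log directory during a quiet period and again during one of the "waves", then comparing max(...) of the two results, would show whether slow fsyncs occur outside the ZooKeeper process as well.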

Here is a screenshot showing ZooKeeper network traffic going to zero at the
time of the connection closed exceptions: http://i.imgur.com/dfNh0.png
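
To check the temporal correlation between slow fsyncs and the error waves across all five servers, the logs can be scanned mechanically. This is a rough sketch under stated assumptions: it assumes each log line starts with a "YYYY-MM-DD HH:MM:SS,mmm" timestamp (the layout is not shown in the excerpts above), and the function names, the 1000 ms threshold and the 10 second window are arbitrary illustrative choices. The two message patterns are taken from the text quoted in symptom (1).

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp prefix, e.g. "2012-11-08 10:15:02,123 - WARN ...".
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")
# Warning text as quoted in symptom (1); group 1 is the duration in ms.
FSYNC_RE = re.compile(r"fsync-ing the write ahead log in \S+ took (\d+)ms")
# Error header as quoted in symptom (1).
ERROR_RE = re.compile(r"NIOServerCnxn@\d+\] - Unexpected Exception")


def parse_ts(line):
    """Return the line's timestamp, or None if it has no timestamp prefix."""
    m = TS_RE.match(line)
    if not m:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    return base + timedelta(milliseconds=int(m.group(2)))


def errors_after_slow_fsync(lines, min_ms=1000, window_s=10):
    """For each fsync warning of at least `min_ms`, count the error lines
    logged within `window_s` seconds after it.

    Returns a list of (fsync_time, fsync_ms, error_count) tuples.
    """
    fsyncs, errors = [], []
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # stack-trace continuation lines carry no timestamp
        m = FSYNC_RE.search(line)
        if m and int(m.group(1)) >= min_ms:
            fsyncs.append((ts, int(m.group(1))))
        elif ERROR_RE.search(line):
            errors.append(ts)
    window = timedelta(seconds=window_s)
    return [(ts, ms, sum(1 for e in errors if ts <= e <= ts + window))
            for ts, ms in fsyncs]
```

Running errors_after_slow_fsync(open(path)) on each server's log and looking for entries with a non-zero error count would show whether the pattern seen once on the leader repeats.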

Does anyone have ideas about what could be causing these "waves" of
CancelledKeyExceptions?

View this message in context: http://zookeeper-user.578899.n2.nabble.com/Connection-closed-exceptions-with-slow-fsync-and-CancelledKeyExceptions-tp7578166.html
Sent from the zookeeper-user mailing list archive at Nabble.com.
Vitalii Tymchyshyn 2012-11-12, 00:12