ZooKeeper Doesn't Quit When OOM Occurs
Ji Zhang 2013-01-04, 03:17
Hi,
I'm using ZooKeeper 3.4.3, and yesterday one of the nodes went down due to an OutOfMemoryError:

2013-01-03 18:36:58,566 [myid:3] - ERROR [CommitProcessor:3:CommitProcessor@148] - Unexpected exception causing CommitProcessor to exit
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2367)
2013-01-03 18:36:58,754 [myid:3] - INFO [CommitProcessor:3:CommitProcessor@150] - CommitProcessor exited loop!
2013-01-03 18:37:01,276 [myid:3] - ERROR [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$1@49] - Thread Thread[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181,5,main] died
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.HashMap.newKeyIterator(HashMap.java:853)
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "SyncThread:3"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "QuorumPeer[myid=3]/0.0.0.0:2181"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"
2013-01-03 18:37:03,343 [myid:3] - ERROR [SyncThread:3:SyncRequestProcessor@151] - Severe unrecoverable error, exiting
2013-01-03 18:37:04,465 [myid:3] - ERROR [SyncThread:3:NIOServerCnxnFactory$1@49] - Thread Thread[SyncThread:3,5,main] died
2013-01-03 18:41:43,477 [myid:3] - INFO [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
Removing file: Jan 3, 2013 5:21:21 AM /var/zookeeper/version-2/log.4001d7bd9
Removing file: Jan 3, 2013 10:36:02 AM /var/zookeeper/version-2/log.4001ee156
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "PurgeTask"

Actually, a lot of other stuff is running on this server, so I don't blame it for throwing an OOM. What bothers me is that the ZooKeeper process doesn't quit when it encounters the OOME. I'm using supervisord to monitor the zk process, so if it followed the fail-fast strategy, it would be restarted afterwards.

Is there any explanation for this?

Thanks.
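
To make the expected fail-fast behavior concrete, here is a minimal sketch in plain Java. It is not ZooKeeper's code: the class name FailFastOnOom and its methods are hypothetical and use only the standard Thread and Runtime APIs. It also only covers threads that do not install their own UncaughtExceptionHandler, which the threads in the log above appear to do.

```java
// Hypothetical illustration only; not part of ZooKeeper.
public final class FailFastOnOom {
    private FailFastOnOom() {}

    /** Returns true if t, or any throwable in its cause chain, is an OutOfMemoryError. */
    static boolean isOutOfMemory(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    /** Installs a JVM-wide handler for exceptions that escape a thread's run method. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            if (isOutOfMemory(error)) {
                // halt() skips shutdown hooks, which may themselves need heap
                // and fail again; the non-zero exit status lets a supervisor
                // such as supervisord see the failure and restart the process.
                Runtime.getRuntime().halt(1);
            }
            System.err.println("Uncaught exception in thread "
                    + thread.getName() + ": " + error);
        });
    }
}
```

Calling FailFastOnOom.install() once at startup would be enough under these assumptions. The handler checks for the error before doing any logging, because allocating a log message on an exhausted heap can itself throw, as the "thrown from the UncaughtExceptionHandler" lines above suggest.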