Re: uncaught exception handler
Done: https://issues.apache.org/jira/browse/ZOOKEEPER-1442 .  I'll try
to get a patch together in the near future.  Thanks.

Jeremy

On 04/03/2012 06:32 PM, Michi Mutsuzaki wrote:
> I agree we shouldn't swallow java.lang.Error. Please go ahead and open a jira.
>
> Thanks!
> --Michi
> ________________________________________
> From: Jeremy Stribling [[EMAIL PROTECTED]]
> Sent: Tuesday, April 03, 2012 4:51 PM
> To: [EMAIL PROTECTED]
> Subject: uncaught exception handler
>
> I'm curious about the origin of the uncaught exception handler that sits
> in NIOServerCnxn (looking at ZK 3.3.5).  It just logs the exception to
> log.error.  I wonder if it makes sense instead to do a System.exit(1) if
> the exception is an OutOfMemoryError (or perhaps a java.lang.Error in
> general, since those are not supposed to be caught).
>
> I ask because our use of ZooKeeper embeds it in a process where some
> other code can cause the JVM to hit its memory limit.  Instead of trying
> to soldier on in the face of adversity like this, it seems better for
> the whole process to come crashing down, to allow whatever monitor
> process is in place to restart the JVM.  When the process just logs and
> ignores errors like this, it seems to lead to the ZK servers being
> unable to make a quorum, even though they are up and running.
>
> Here's a sample backtrace I've seen:
>
> 2012-04-03 19:40:03,643 600695063 [QuorumPeer:/172.29.1.220:2888] ERROR org.apache.zookeeper.server.NIOServerCnxn  - Thread Thread[QuorumPeer:/172.29.1.220:2888,5,main] died
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:102)
>     at org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:232)
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:602)
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.<init>(FileTxnLog.java:504)
>     at org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:131)
>     at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:222)
>     at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:242)
>     at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:279)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:658)
>
> Any thoughts?  Happy to create a JIRA and possibly a patch if there's
> interest.  Thanks,
>
> Jeremy
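
For readers following the thread, the behavior proposed above (keep logging ordinary exceptions, but bring the process down on a java.lang.Error such as OutOfMemoryError) might look roughly like the sketch below. This is not the ZooKeeper code or the eventual ZOOKEEPER-1442 patch. The class name FatalErrorHandler and the injectable exitFn are hypothetical, introduced only so the exit path can be exercised without killing the JVM, and System.err stands in for whatever logger the server actually uses.

```java
import java.util.function.IntConsumer;

/**
 * Hypothetical uncaught exception handler: logs every throwable,
 * and terminates the process only when it is a java.lang.Error.
 */
final class FatalErrorHandler implements Thread.UncaughtExceptionHandler {
    // Assumption: the exit action is injectable so it can be replaced in tests.
    private final IntConsumer exitFn;

    /** Default behavior: exit the JVM via System.exit. */
    FatalErrorHandler() {
        this(System::exit);
    }

    FatalErrorHandler(IntConsumer exitFn) {
        this.exitFn = exitFn;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        // Stand-in for the log.error call mentioned in the thread.
        System.err.println("Thread " + t + " died: " + e);

        // Errors (OutOfMemoryError and friends) are not meant to be survived;
        // exit with status 1 so an external monitor can restart the process.
        if (e instanceof Error) {
            exitFn.accept(1);
        }
    }
}
```

It could be installed with Thread.setDefaultUncaughtExceptionHandler(new FatalErrorHandler()) or on an individual thread with setUncaughtExceptionHandler. One design point worth weighing: System.exit runs shutdown hooks, which may themselves stall when the heap is exhausted, so passing Runtime.getRuntime()::halt as exitFn is an alternative if a hard stop is preferred.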