Done: https://issues.apache.org/jira/browse/ZOOKEEPER-1442 . I'll try
to get a patch together in the near future. Thanks.
On 04/03/2012 06:32 PM, Michi Mutsuzaki wrote:
> I agree we shouldn't swallow java.lang.Error. Please go ahead and open a jira.
> From: Jeremy Stribling [[EMAIL PROTECTED]]
> Sent: Tuesday, April 03, 2012 4:51 PM
> To: [EMAIL PROTECTED]
> Subject: uncaught exception handler
> I'm curious about the origin of the uncaught exception handler that sits
> in NIOServerCnxn (looking at ZK 3.3.5). It just logs the exception to
> log.error. I wonder if it makes sense instead to do a System.exit(1) if
> the exception is an OutOfMemoryError (or perhaps a java.lang.Error in
> general, since those are not supposed to be caught).
> I ask because our use of ZooKeeper embeds it in a process where some
> other code can cause the JVM to hit its memory limit. Instead of trying
> to soldier on in the face of adversity like this, it seems better for
> the whole process to come crashing down, to allow whatever monitor
> process is in place to restart the JVM. When the process just logs and
> ignores errors like this, it seems to lead to the ZK servers being
> unable to make a quorum, even though they are up and running.
> Here's a sample backtrace I've seen:
> 2012-04-03 19:40:03,643 600695063 [QuorumPeer:/172.29.1.220:2888] ERROR org.apache.zookeeper.server.NIOServerCnxn - Thread Thread[QuorumPeer:/172.29.1.220:2888,5,main] died
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:279)
> Any thoughts? Happy to create a JIRA and possibly a patch if there's
> interest. Thanks,
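
For reference, here is a rough sketch of the behavior proposed in this thread: an uncaught exception handler that logs as before, but also exits when the throwable is a java.lang.Error. This is not the ZooKeeper code and not the eventual patch. The class name ExitOnErrorHandler, the exiting() factory, and the injectable exit action are all hypothetical, chosen so the handler can be exercised without actually killing the JVM.

```java
import java.util.function.IntConsumer;

// Hypothetical illustration only; not part of ZooKeeper.
final class ExitOnErrorHandler implements Thread.UncaughtExceptionHandler {
    // Action to run with an exit status when a fatal error is seen.
    private final IntConsumer exitAction;

    ExitOnErrorHandler(IntConsumer exitAction) {
        this.exitAction = exitAction;
    }

    // Handler that really terminates the JVM via System.exit.
    static ExitOnErrorHandler exiting() {
        return new ExitOnErrorHandler(System::exit);
    }

    // Treat every java.lang.Error (OutOfMemoryError included) as fatal.
    static boolean isFatal(Throwable e) {
        return e instanceof Error;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        // Stand-in for the existing log.error call.
        System.err.println("Thread " + t + " died: " + e);
        if (isFatal(e)) {
            // Exit with status 1 so an external monitor can restart the process.
            exitAction.accept(1);
        }
    }
}
```

It could be installed process-wide with Thread.setDefaultUncaughtExceptionHandler(ExitOnErrorHandler.exiting()), or per thread with Thread.setUncaughtExceptionHandler. One caveat with this sketch: after an OutOfMemoryError, the logging call itself may fail to allocate, so a real patch might want to exit before doing anything else.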