Zookeeper >> mail # dev >> Input on a change

Input on a change
Hi everyone,

I'm trying to evaluate a patch that Jeremy Stribling has submitted, and I'd
like some feedback from the user base on it.

The current behavior of ZK when we get an uncaught exception is to log it
and try to move on. This is arguably not the right thing to do, and will
possibly cause ZK to limp along with a bad VM (say, in an OOM state) for
longer than it should.
The patch proposes that when we get an instance of java.lang.Error, we
should call System.exit to fast-fail the process. With the possible
exception of ThreadDeath (which may or may not be an unrecoverable system
state depending on the thread), I think this makes sense, but I would like
to hear from others if they have an opinion. I think it's better to kill
the process and let your monitoring services detect process death (and thus
restart) than possibly linger unresponsive for a while. Are there scenarios
that we're missing where this error can occur and you wouldn't want the
process killed?
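
To make the proposal concrete, here is a minimal sketch of the kind of behavior being discussed. It is not the actual patch: the class name FailFastHandler, the isFatal helper, and the injectable exit action are all hypothetical, chosen only for illustration. It assumes the check is done in a Thread.UncaughtExceptionHandler and that ThreadDeath is excluded, which is still an open question above.

```java
import java.util.function.IntConsumer;

// Hypothetical illustration only; not the code from the submitted patch.
final class FailFastHandler implements Thread.UncaughtExceptionHandler {

    // The exit action is injectable so the policy can be exercised
    // without actually terminating the JVM.
    private final IntConsumer exitAction;

    FailFastHandler(IntConsumer exitAction) {
        this.exitAction = exitAction;
    }

    // Assumption: every java.lang.Error is fatal except ThreadDeath,
    // which may only mean that a single thread was stopped.
    @SuppressWarnings({"deprecation", "removal"})
    static boolean isFatal(Throwable t) {
        return t instanceof Error && !(t instanceof ThreadDeath);
    }

    @Override
    public void uncaughtException(Thread thread, Throwable t) {
        // Always log, as the current behavior does.
        System.err.println("Uncaught throwable in thread "
                + thread.getName() + ": " + t);
        if (isFatal(t)) {
            // Fast-fail so that external monitoring can detect the
            // process death and restart the server.
            exitAction.accept(1);
        }
        // Anything else falls through to "log it and move on".
    }

    // Installs the handler for all threads, using System.exit as the
    // real exit action.
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler(
                new FailFastHandler(System::exit));
    }
}
```

With something like this, an OutOfMemoryError escaping a thread would end the process with exit code 1, while a RuntimeException would only be logged. Whether a default handler is the right place for the check, as opposed to the individual thread run loops, is part of what I'd like input on.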

Thanks for your feedback,

Scott Fines 2012-04-13, 15:15
Michi Mutsuzaki 2012-04-13, 20:19
Jeremy Stribling 2012-04-13, 21:25
Michi Mutsuzaki 2012-04-13, 21:31
Michi Mutsuzaki 2012-04-13, 21:38
Camille Fournier 2012-04-15, 18:28
Ishaaq Chandy 2012-04-16, 03:20
Camille Fournier 2012-04-16, 12:55
Ishaaq Chandy 2012-04-16, 16:52
Camille Fournier 2012-04-16, 17:13