|
|
Camille Fournier 2012-04-13, 15:09
Hi everyone, I'm trying to evaluate a patch that Jeremy Stribling has submitted, and I'd like some feedback from the user base on it. https://issues.apache.org/jira/browse/ZOOKEEPER-1442The current behavior of ZK when we get an uncaught exception is to log it and try to move on. This is arguably not the right thing to do, and will possibly cause ZK to limp along with a bad VM (say, in an OOM state) for longer than it should. The patch proposes that when we get an instance of java.lang.Error, we should do a system.exit to fast-fail the process. With the possible exception of ThreadDeath (which may or may not be an unrecoverable system state depending on the thread), I think this makes sense, but I would like to hear from others if they have an opinion. I think it's better to kill the process and let your monitoring services detect process death (and thus restart) than possibly linger unresponsive for a while, are there scenarios that we're missing where this error can occur and you wouldn't want the process killed? Thanks for your feedback, Camille
+
Camille Fournier 2012-04-13, 15:09
Scott Fines 2012-04-13, 15:15
On some JVMs (the HotSpot for sure, but maybe others too?) there's a JVM for performing actions on OutOfMemoryErrors (-XX:OnOutOfMemoryError="<cmd args>, -XX:+HeapDumpOnOutOfMemoryError and maybe some others that I can't remember off the top of my head). Will these triggers still be fired, or will the catch-all prevent them? I'm still +1 for the change no matter what, but it's probably a good idea to mention that in the docs if they don't work. Scott On Fri, Apr 13, 2012 at 10:09 AM, Camille Fournier <[EMAIL PROTECTED]>wrote: > Hi everyone, > > I'm trying to evaluate a patch that Jeremy Stribling has submitted, and I'd > like some feedback from the user base on it. > https://issues.apache.org/jira/browse/ZOOKEEPER-1442> > The current behavior of ZK when we get an uncaught exception is to log it > and try to move on. This is arguably not the right thing to do, and will > possibly cause ZK to limp along with a bad VM (say, in an OOM state) for > longer than it should. > The patch proposes that when we get an instance of java.lang.Error, we > should do a system.exit to fast-fail the process. With the possible > exception of ThreadDeath (which may or may not be an unrecoverable system > state depending on the thread), I think this makes sense, but I would like > to hear from others if they have an opinion. I think it's better to kill > the process and let your monitoring services detect process death (and thus > restart) than possibly linger unresponsive for a while, are there scenarios > that we're missing where this error can occur and you wouldn't want the > process killed? > > Thanks for your feedback, > > Camille >
+
Scott Fines 2012-04-13, 15:15
Michi Mutsuzaki 2012-04-13, 20:19
I agree with both Scott's and Ryan's points. I think it makes to: 1. Make this behavior configurable (with default being "turned off") to preserve backward compatibility. 2. Re-throw the exception instead of exiting with System.exit(1) so that users can use flags like -XX:+HeapDumpOnOutOfMemoryError. --Michi ________________________________________ From: Scott Fines [[EMAIL PROTECTED]] Sent: Friday, April 13, 2012 8:15 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: Input on a change On some JVMs (the HotSpot for sure, but maybe others too?) there's a JVM for performing actions on OutOfMemoryErrors (-XX:OnOutOfMemoryError="<cmd args>, -XX:+HeapDumpOnOutOfMemoryError and maybe some others that I can't remember off the top of my head). Will these triggers still be fired, or will the catch-all prevent them? I'm still +1 for the change no matter what, but it's probably a good idea to mention that in the docs if they don't work. Scott On Fri, Apr 13, 2012 at 10:09 AM, Camille Fournier <[EMAIL PROTECTED]>wrote: > Hi everyone, > > I'm trying to evaluate a patch that Jeremy Stribling has submitted, and I'd > like some feedback from the user base on it. > https://issues.apache.org/jira/browse/ZOOKEEPER-1442> > The current behavior of ZK when we get an uncaught exception is to log it > and try to move on. This is arguably not the right thing to do, and will > possibly cause ZK to limp along with a bad VM (say, in an OOM state) for > longer than it should. > The patch proposes that when we get an instance of java.lang.Error, we > should do a system.exit to fast-fail the process. With the possible > exception of ThreadDeath (which may or may not be an unrecoverable system > state depending on the thread), I think this makes sense, but I would like > to hear from others if they have an opinion. I think it's better to kill > the process and let your monitoring services detect process death (and thus > restart) than possibly linger unresponsive for a while, are there scenarios > that we're missing where this error can occur and you wouldn't want the > process killed? > > Thanks for your feedback, > > Camille >
+
Michi Mutsuzaki 2012-04-13, 20:19
Jeremy Stribling 2012-04-13, 21:25
On 04/13/2012 01:19 PM, Michi Mutsuzaki wrote: > I agree with both Scott's and Ryan's points. I think it makes to: > > 1. Make this behavior configurable (with default being "turned off") to preserve backward compatibility. > 2. Re-throw the exception instead of exiting with System.exit(1) so that users can use flags like -XX:+HeapDumpOnOutOfMemoryError. I don't think re-throwing exceptions from an uncaught exception handler is an option: http://docs.oracle.com/javase/6/docs/api/java/lang/Thread.UncaughtExceptionHandler.html#uncaughtException%28java.lang.Thread,%20java.lang.Throwable%29" Any exception thrown by this method will be ignored by the Java Virtual Machine." > --Michi > ________________________________________ > From: Scott Fines [[EMAIL PROTECTED]] > Sent: Friday, April 13, 2012 8:15 AM > To: [EMAIL PROTECTED] > Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] > Subject: Re: Input on a change > > On some JVMs (the HotSpot for sure, but maybe others too?) there's a JVM > for performing actions on OutOfMemoryErrors (-XX:OnOutOfMemoryError="<cmd > args>, -XX:+HeapDumpOnOutOfMemoryError and maybe some others that I can't > remember off the top of my head). Will these triggers still be fired, or > will the catch-all prevent them? > > I'm still +1 for the change no matter what, but it's probably a good idea > to mention that in the docs if they don't work. > > Scott > > On Fri, Apr 13, 2012 at 10:09 AM, Camille Fournier<[EMAIL PROTECTED]>wrote: > >> Hi everyone, >> >> I'm trying to evaluate a patch that Jeremy Stribling has submitted, and I'd >> like some feedback from the user base on it. >> https://issues.apache.org/jira/browse/ZOOKEEPER-1442>> >> The current behavior of ZK when we get an uncaught exception is to log it >> and try to move on. This is arguably not the right thing to do, and will >> possibly cause ZK to limp along with a bad VM (say, in an OOM state) for >> longer than it should. >> The patch proposes that when we get an instance of java.lang.Error, we >> should do a system.exit to fast-fail the process. With the possible >> exception of ThreadDeath (which may or may not be an unrecoverable system >> state depending on the thread), I think this makes sense, but I would like >> to hear from others if they have an opinion. I think it's better to kill >> the process and let your monitoring services detect process death (and thus >> restart) than possibly linger unresponsive for a while, are there scenarios >> that we're missing where this error can occur and you wouldn't want the >> process killed? >> >> Thanks for your feedback, >> >> Camille >>
+
Jeremy Stribling 2012-04-13, 21:25
Michi Mutsuzaki 2012-04-13, 21:31
Ah I see. I guess log and System.exit(1) is the best we can do then. --Michi On Fri, Apr 13, 2012 at 2:25 PM, Jeremy Stribling <[EMAIL PROTECTED]> wrote: > On 04/13/2012 01:19 PM, Michi Mutsuzaki wrote: >> >> I agree with both Scott's and Ryan's points. I think it makes to: >> >> 1. Make this behavior configurable (with default being "turned off") to >> preserve backward compatibility. >> 2. Re-throw the exception instead of exiting with System.exit(1) so that >> users can use flags like -XX:+HeapDumpOnOutOfMemoryError. > > > I don't think re-throwing exceptions from an uncaught exception handler is > an option: > http://docs.oracle.com/javase/6/docs/api/java/lang/Thread.UncaughtExceptionHandler.html#uncaughtException%28java.lang.Thread,%20java.lang.Throwable%29> > " Any exception thrown by this method will be ignored by the Java Virtual > Machine." > > >> --Michi >> ________________________________________ >> From: Scott Fines [[EMAIL PROTECTED]] >> Sent: Friday, April 13, 2012 8:15 AM >> To: [EMAIL PROTECTED] >> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] >> Subject: Re: Input on a change >> >> On some JVMs (the HotSpot for sure, but maybe others too?) there's a JVM >> for performing actions on OutOfMemoryErrors (-XX:OnOutOfMemoryError="<cmd >> args>, -XX:+HeapDumpOnOutOfMemoryError and maybe some others that I can't >> remember off the top of my head). Will these triggers still be fired, or >> will the catch-all prevent them? >> >> I'm still +1 for the change no matter what, but it's probably a good idea >> to mention that in the docs if they don't work. >> >> Scott >> >> On Fri, Apr 13, 2012 at 10:09 AM, Camille >> Fournier<[EMAIL PROTECTED]>wrote: >> >>> Hi everyone, >>> >>> I'm trying to evaluate a patch that Jeremy Stribling has submitted, and >>> I'd >>> like some feedback from the user base on it. >>> https://issues.apache.org/jira/browse/ZOOKEEPER-1442>>> >>> The current behavior of ZK when we get an uncaught exception is to log it >>> and try to move on. This is arguably not the right thing to do, and will >>> possibly cause ZK to limp along with a bad VM (say, in an OOM state) for >>> longer than it should. >>> The patch proposes that when we get an instance of java.lang.Error, we >>> should do a system.exit to fast-fail the process. With the possible >>> exception of ThreadDeath (which may or may not be an unrecoverable system >>> state depending on the thread), I think this makes sense, but I would >>> like >>> to hear from others if they have an opinion. I think it's better to kill >>> the process and let your monitoring services detect process death (and >>> thus >>> restart) than possibly linger unresponsive for a while, are there >>> scenarios >>> that we're missing where this error can occur and you wouldn't want the >>> process killed? >>> >>> Thanks for your feedback, >>> >>> Camille >>> >
+
Michi Mutsuzaki 2012-04-13, 21:31
Michi Mutsuzaki 2012-04-13, 21:38
By the way, grepping the source code, I see we use various exit codes. Does anybody know what they all mean? --Michi On Fri, Apr 13, 2012 at 2:31 PM, Michi Mutsuzaki <[EMAIL PROTECTED]> wrote: > Ah I see. I guess log and System.exit(1) is the best we can do then. > > --Michi > > On Fri, Apr 13, 2012 at 2:25 PM, Jeremy Stribling <[EMAIL PROTECTED]> wrote: >> On 04/13/2012 01:19 PM, Michi Mutsuzaki wrote: >>> >>> I agree with both Scott's and Ryan's points. I think it makes to: >>> >>> 1. Make this behavior configurable (with default being "turned off") to >>> preserve backward compatibility. >>> 2. Re-throw the exception instead of exiting with System.exit(1) so that >>> users can use flags like -XX:+HeapDumpOnOutOfMemoryError. >> >> >> I don't think re-throwing exceptions from an uncaught exception handler is >> an option: >> http://docs.oracle.com/javase/6/docs/api/java/lang/Thread.UncaughtExceptionHandler.html#uncaughtException%28java.lang.Thread,%20java.lang.Throwable%29>> >> " Any exception thrown by this method will be ignored by the Java Virtual >> Machine." >> >> >>> --Michi >>> ________________________________________ >>> From: Scott Fines [[EMAIL PROTECTED]] >>> Sent: Friday, April 13, 2012 8:15 AM >>> To: [EMAIL PROTECTED] >>> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED] >>> Subject: Re: Input on a change >>> >>> On some JVMs (the HotSpot for sure, but maybe others too?) there's a JVM >>> for performing actions on OutOfMemoryErrors (-XX:OnOutOfMemoryError="<cmd >>> args>, -XX:+HeapDumpOnOutOfMemoryError and maybe some others that I can't >>> remember off the top of my head). Will these triggers still be fired, or >>> will the catch-all prevent them? >>> >>> I'm still +1 for the change no matter what, but it's probably a good idea >>> to mention that in the docs if they don't work. >>> >>> Scott >>> >>> On Fri, Apr 13, 2012 at 10:09 AM, Camille >>> Fournier<[EMAIL PROTECTED]>wrote: >>> >>>> Hi everyone, >>>> >>>> I'm trying to evaluate a patch that Jeremy Stribling has submitted, and >>>> I'd >>>> like some feedback from the user base on it. >>>> https://issues.apache.org/jira/browse/ZOOKEEPER-1442>>>> >>>> The current behavior of ZK when we get an uncaught exception is to log it >>>> and try to move on. This is arguably not the right thing to do, and will >>>> possibly cause ZK to limp along with a bad VM (say, in an OOM state) for >>>> longer than it should. >>>> The patch proposes that when we get an instance of java.lang.Error, we >>>> should do a system.exit to fast-fail the process. With the possible >>>> exception of ThreadDeath (which may or may not be an unrecoverable system >>>> state depending on the thread), I think this makes sense, but I would >>>> like >>>> to hear from others if they have an opinion. I think it's better to kill >>>> the process and let your monitoring services detect process death (and >>>> thus >>>> restart) than possibly linger unresponsive for a while, are there >>>> scenarios >>>> that we're missing where this error can occur and you wouldn't want the >>>> process killed? >>>> >>>> Thanks for your feedback, >>>> >>>> Camille >>>> >>
+
Michi Mutsuzaki 2012-04-13, 21:38
Camille Fournier 2012-04-15, 18:28
This is a good point. I think this change should be fine for the server portion of the code, since it's designed to be run as a standalone system. But for the client connection to also call system.exit on such an error is overreaching for all the reasons listed below. C 2012/4/15 Віталій Тимчишин <[EMAIL PROTECTED]>: > I really would not like for any library to perform a System.exit call. This > would make huge program exit out of sudden (think about j2ee, you may be > bitten by security manager). Note that there are more or less safe errors, > like StackOverflowError. > Also System.exit make testing nightmare. E.g. maven2 silently skips any > tests after the one that calls System.exit. And everything's green. > As for me good options are: > 1) Call user-provided uncaught exception handler. Use the one from the > thread that created the connection if one is not specified explicity. > 1) Stop everything, notifying user with a global watcher. If it's possible, > clean any static state (e.g. restart threads) and allow to restart > connection. > In any case, call user code. Good system already know how to react (it may > want to send email to admin), allow it to perform well. > > Best regards, Vitalii Tymchyshyn. > > 2012/4/13 Camille Fournier <[EMAIL PROTECTED]> > >> Hi everyone, >> >> I'm trying to evaluate a patch that Jeremy Stribling has submitted, and I'd >> like some feedback from the user base on it. >> https://issues.apache.org/jira/browse/ZOOKEEPER-1442>> >> The current behavior of ZK when we get an uncaught exception is to log it >> and try to move on. This is arguably not the right thing to do, and will >> possibly cause ZK to limp along with a bad VM (say, in an OOM state) for >> longer than it should. >> The patch proposes that when we get an instance of java.lang.Error, we >> should do a system.exit to fast-fail the process. With the possible >> exception of ThreadDeath (which may or may not be an unrecoverable system >> state depending on the thread), I think this makes sense, but I would like >> to hear from others if they have an opinion. I think it's better to kill >> the process and let your monitoring services detect process death (and thus >> restart) than possibly linger unresponsive for a while, are there scenarios >> that we're missing where this error can occur and you wouldn't want the >> process killed? >> >> Thanks for your feedback, >> >> Camille >> > > > > -- > Best regards, > Vitalii Tymchyshyn
+
Camille Fournier 2012-04-15, 18:28
Ishaaq Chandy 2012-04-16, 03:20
I'd go so far as to say that even the server-code should avoid System.exit. Just because it is "meant" to be a standalone system doesn't mean that code that makes it impossible to embed it should be encouraged. For e.g, we embed a local version of ZK to be used inside our unit tests. This makes it much easier for us to control ZK to coincide with test expectations as well as making for much faster build times. It would be a shame if the embedded ZK started killing the JVM. Ishaaq On 16 April 2012 04:28, Camille Fournier <[EMAIL PROTECTED]> wrote: > This is a good point. > I think this change should be fine for the server portion of the code, > since it's designed to be run as a standalone system. But for the > client connection to also call system.exit on such an error is > overreaching for all the reasons listed below. > > C > > 2012/4/15 Віталій Тимчишин <[EMAIL PROTECTED]>: > > I really would not like for any library to perform a System.exit call. > This > > would make huge program exit out of sudden (think about j2ee, you may be > > bitten by security manager). Note that there are more or less safe > errors, > > like StackOverflowError. > > Also System.exit make testing nightmare. E.g. maven2 silently skips any > > tests after the one that calls System.exit. And everything's green. > > As for me good options are: > > 1) Call user-provided uncaught exception handler. Use the one from the > > thread that created the connection if one is not specified explicity. > > 1) Stop everything, notifying user with a global watcher. If it's > possible, > > clean any static state (e.g. restart threads) and allow to restart > > connection. > > In any case, call user code. Good system already know how to react (it > may > > want to send email to admin), allow it to perform well. > > > > Best regards, Vitalii Tymchyshyn. > > > > 2012/4/13 Camille Fournier <[EMAIL PROTECTED]> > > > >> Hi everyone, > >> > >> I'm trying to evaluate a patch that Jeremy Stribling has submitted, and > I'd > >> like some feedback from the user base on it. > >> https://issues.apache.org/jira/browse/ZOOKEEPER-1442> >> > >> The current behavior of ZK when we get an uncaught exception is to log > it > >> and try to move on. This is arguably not the right thing to do, and will > >> possibly cause ZK to limp along with a bad VM (say, in an OOM state) for > >> longer than it should. > >> The patch proposes that when we get an instance of java.lang.Error, we > >> should do a system.exit to fast-fail the process. With the possible > >> exception of ThreadDeath (which may or may not be an unrecoverable > system > >> state depending on the thread), I think this makes sense, but I would > like > >> to hear from others if they have an opinion. I think it's better to kill > >> the process and let your monitoring services detect process death (and > thus > >> restart) than possibly linger unresponsive for a while, are there > scenarios > >> that we're missing where this error can occur and you wouldn't want the > >> process killed? > >> > >> Thanks for your feedback, > >> > >> Camille > >> > > > > > > > > -- > > Best regards, > > Vitalii Tymchyshyn >
+
Ishaaq Chandy 2012-04-16, 03:20
Camille Fournier 2012-04-16, 12:55
I believe that this change is inspired by someone that runs zk embedded. Personally I'm not moved by the testing argument, embedding the server for testing is a bit of an anti pattern in my mind. >From my phone On Apr 15, 2012 11:20 PM, "Ishaaq Chandy" <[EMAIL PROTECTED]> wrote: > I'd go so far as to say that even the server-code should avoid System.exit. > Just because it is "meant" to be a standalone system doesn't mean that code > that makes it impossible to embed it should be encouraged. > > For e.g, we embed a local version of ZK to be used inside our unit tests. > This makes it much easier for us to control ZK to coincide with test > expectations as well as making for much faster build times. It would be a > shame if the embedded ZK started killing the JVM. > > Ishaaq > > On 16 April 2012 04:28, Camille Fournier <[EMAIL PROTECTED]> wrote: > > > This is a good point. > > I think this change should be fine for the server portion of the code, > > since it's designed to be run as a standalone system. But for the > > client connection to also call system.exit on such an error is > > overreaching for all the reasons listed below. > > > > C > > > > 2012/4/15 Віталій Тимчишин <[EMAIL PROTECTED]>: > > > I really would not like for any library to perform a System.exit call. > > This > > > would make huge program exit out of sudden (think about j2ee, you may > be > > > bitten by security manager). Note that there are more or less safe > > errors, > > > like StackOverflowError. > > > Also System.exit make testing nightmare. E.g. maven2 silently skips any > > > tests after the one that calls System.exit. And everything's green. > > > As for me good options are: > > > 1) Call user-provided uncaught exception handler. Use the one from the > > > thread that created the connection if one is not specified explicity. > > > 1) Stop everything, notifying user with a global watcher. If it's > > possible, > > > clean any static state (e.g. restart threads) and allow to restart > > > connection. > > > In any case, call user code. Good system already know how to react (it > > may > > > want to send email to admin), allow it to perform well. > > > > > > Best regards, Vitalii Tymchyshyn. > > > > > > 2012/4/13 Camille Fournier <[EMAIL PROTECTED]> > > > > > >> Hi everyone, > > >> > > >> I'm trying to evaluate a patch that Jeremy Stribling has submitted, > and > > I'd > > >> like some feedback from the user base on it. > > >> https://issues.apache.org/jira/browse/ZOOKEEPER-1442> > >> > > >> The current behavior of ZK when we get an uncaught exception is to log > > it > > >> and try to move on. This is arguably not the right thing to do, and > will > > >> possibly cause ZK to limp along with a bad VM (say, in an OOM state) > for > > >> longer than it should. > > >> The patch proposes that when we get an instance of java.lang.Error, we > > >> should do a system.exit to fast-fail the process. With the possible > > >> exception of ThreadDeath (which may or may not be an unrecoverable > > system > > >> state depending on the thread), I think this makes sense, but I would > > like > > >> to hear from others if they have an opinion. I think it's better to > kill > > >> the process and let your monitoring services detect process death (and > > thus > > >> restart) than possibly linger unresponsive for a while, are there > > scenarios > > >> that we're missing where this error can occur and you wouldn't want > the > > >> process killed? > > >> > > >> Thanks for your feedback, > > >> > > >> Camille > > >> > > > > > > > > > > > > -- > > > Best regards, > > > Vitalii Tymchyshyn > > >
+
Camille Fournier 2012-04-16, 12:55
Ishaaq Chandy 2012-04-16, 16:52
Fair enough, if you think it is an anti-pattern then perhaps you can suggest an alternative that allows one to write tests that are (in descending order of priority): 1. Quick and easy to setup and teardown 2. Repeatably testable and debuggable in an IDE without having to resort to the external build tool 3. Testable in parallel, i.e. having more than one build running on a CI server, so need some way to avoid port clashes 4. Cross-platform - i.e. run the exact same build sequence on multiple OSes. Ishaaq On 16 April 2012 22:55, Camille Fournier <[EMAIL PROTECTED]> wrote: > I believe that this change is inspired by someone that runs zk embedded. > Personally I'm not moved by the testing argument, embedding the server for > testing is a bit of an anti pattern in my mind. > > From my phone > On Apr 15, 2012 11:20 PM, "Ishaaq Chandy" <[EMAIL PROTECTED]> wrote: > > > I'd go so far as to say that even the server-code should avoid > System.exit. > > Just because it is "meant" to be a standalone system doesn't mean that > code > > that makes it impossible to embed it should be encouraged. > > > > For e.g, we embed a local version of ZK to be used inside our unit tests. > > This makes it much easier for us to control ZK to coincide with test > > expectations as well as making for much faster build times. It would be a > > shame if the embedded ZK started killing the JVM. > > > > Ishaaq > > > > On 16 April 2012 04:28, Camille Fournier <[EMAIL PROTECTED]> wrote: > > > > > This is a good point. > > > I think this change should be fine for the server portion of the code, > > > since it's designed to be run as a standalone system. But for the > > > client connection to also call system.exit on such an error is > > > overreaching for all the reasons listed below. > > > > > > C > > > > > > 2012/4/15 Віталій Тимчишин <[EMAIL PROTECTED]>: > > > > I really would not like for any library to perform a System.exit > call. > > > This > > > > would make huge program exit out of sudden (think about j2ee, you may > > be > > > > bitten by security manager). Note that there are more or less safe > > > errors, > > > > like StackOverflowError. > > > > Also System.exit make testing nightmare. E.g. maven2 silently skips > any > > > > tests after the one that calls System.exit. And everything's green. > > > > As for me good options are: > > > > 1) Call user-provided uncaught exception handler. Use the one from > the > > > > thread that created the connection if one is not specified explicity. > > > > 1) Stop everything, notifying user with a global watcher. If it's > > > possible, > > > > clean any static state (e.g. restart threads) and allow to restart > > > > connection. > > > > In any case, call user code. Good system already know how to react > (it > > > may > > > > want to send email to admin), allow it to perform well. > > > > > > > > Best regards, Vitalii Tymchyshyn. > > > > > > > > 2012/4/13 Camille Fournier <[EMAIL PROTECTED]> > > > > > > > >> Hi everyone, > > > >> > > > >> I'm trying to evaluate a patch that Jeremy Stribling has submitted, > > and > > > I'd > > > >> like some feedback from the user base on it. > > > >> https://issues.apache.org/jira/browse/ZOOKEEPER-1442> > > >> > > > >> The current behavior of ZK when we get an uncaught exception is to > log > > > it > > > >> and try to move on. This is arguably not the right thing to do, and > > will > > > >> possibly cause ZK to limp along with a bad VM (say, in an OOM state) > > for > > > >> longer than it should. > > > >> The patch proposes that when we get an instance of java.lang.Error, > we > > > >> should do a system.exit to fast-fail the process. With the possible > > > >> exception of ThreadDeath (which may or may not be an unrecoverable > > > system > > > >> state depending on the thread), I think this makes sense, but I > would > > > like > > > >> to hear from others if they have an opinion. I think it's better to > > kill > > > >> the process and let your monitoring services detect process death
+
Ishaaq Chandy 2012-04-16, 16:52
Camille Fournier 2012-04-16, 17:13
Sure, mocking it's a helluva lot faster too. C >From my phone On Apr 16, 2012 12:54 PM, "Ishaaq Chandy" <[EMAIL PROTECTED]> wrote: > Fair enough, if you think it is an anti-pattern then perhaps you can > suggest an alternative that allows one to write tests that are (in > descending order of priority): > > 1. Quick and easy to setup and teardown > 2. Repeatably testable and debuggable in an IDE without having to resort to > the external build tool > 3. Testable in parallel, i.e. having more than one build running on a CI > server, so need some way to avoid port clashes > 4. Cross-platform - i.e. run the exact same build sequence on multiple > OSes. > > Ishaaq > > On 16 April 2012 22:55, Camille Fournier <[EMAIL PROTECTED]> wrote: > > > I believe that this change is inspired by someone that runs zk embedded. > > Personally I'm not moved by the testing argument, embedding the server > for > > testing is a bit of an anti pattern in my mind. > > > > From my phone > > On Apr 15, 2012 11:20 PM, "Ishaaq Chandy" <[EMAIL PROTECTED]> wrote: > > > > > I'd go so far as to say that even the server-code should avoid > > System.exit. > > > Just because it is "meant" to be a standalone system doesn't mean that > > code > > > that makes it impossible to embed it should be encouraged. > > > > > > For e.g, we embed a local version of ZK to be used inside our unit > tests. > > > This makes it much easier for us to control ZK to coincide with test > > > expectations as well as making for much faster build times. It would > be a > > > shame if the embedded ZK started killing the JVM. > > > > > > Ishaaq > > > > > > On 16 April 2012 04:28, Camille Fournier <[EMAIL PROTECTED]> wrote: > > > > > > > This is a good point. > > > > I think this change should be fine for the server portion of the > code, > > > > since it's designed to be run as a standalone system. But for the > > > > client connection to also call system.exit on such an error is > > > > overreaching for all the reasons listed below. > > > > > > > > C > > > > > > > > 2012/4/15 Віталій Тимчишин <[EMAIL PROTECTED]>: > > > > > I really would not like for any library to perform a System.exit > > call. > > > > This > > > > > would make huge program exit out of sudden (think about j2ee, you > may > > > be > > > > > bitten by security manager). Note that there are more or less safe > > > > errors, > > > > > like StackOverflowError. > > > > > Also System.exit make testing nightmare. E.g. maven2 silently skips > > any > > > > > tests after the one that calls System.exit. And everything's green. > > > > > As for me good options are: > > > > > 1) Call user-provided uncaught exception handler. Use the one from > > the > > > > > thread that created the connection if one is not specified > explicity. > > > > > 1) Stop everything, notifying user with a global watcher. If it's > > > > possible, > > > > > clean any static state (e.g. restart threads) and allow to restart > > > > > connection. > > > > > In any case, call user code. Good system already know how to react > > (it > > > > may > > > > > want to send email to admin), allow it to perform well. > > > > > > > > > > Best regards, Vitalii Tymchyshyn. > > > > > > > > > > 2012/4/13 Camille Fournier <[EMAIL PROTECTED]> > > > > > > > > > >> Hi everyone, > > > > >> > > > > >> I'm trying to evaluate a patch that Jeremy Stribling has > submitted, > > > and > > > > I'd > > > > >> like some feedback from the user base on it. > > > > >> https://issues.apache.org/jira/browse/ZOOKEEPER-1442> > > > >> > > > > >> The current behavior of ZK when we get an uncaught exception is to > > log > > > > it > > > > >> and try to move on. This is arguably not the right thing to do, > and > > > will > > > > >> possibly cause ZK to limp along with a bad VM (say, in an OOM > state) > > > for > > > > >> longer than it should. > > > > >> The patch proposes that when we get an instance of > java.lang.Error, > > we > > > > >> should do a system.exit to fast-fail the process. With the
+
Camille Fournier 2012-04-16, 17:13
|
|