Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?
Ted Yu 2012-03-20, 17:09
I logged https://issues.apache.org/jira/browse/ZOOKEEPER-1428
If you feel there is anything missing in the JIRA, feel free to add it.
Thanks for your help on this issue.
On Tue, Mar 20, 2012 at 9:42 AM, Patrick Hunt <[EMAIL PROTECTED]> wrote:
> On Tue, Mar 20, 2012 at 9:32 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > Near term, if we can find a way for a shell script to detect the existence
> > of a particular zookeeper node, rolling-restart.sh can be restored.
> > Otherwise we may need to remove it.
> I just tested this out with 3.4, and I see the following when statting
> a nonexistent znode:
> [zk: (CONNECTED) 1] stat /foobar
> Node does not exist: /foobar
> vs statting one that does exist:
> [zk: (CONNECTED) 2] stat /
> cZxid = 0x0
> ctime = Wed Dec 31 16:00:00 PST 1969
> mZxid = 0x0
> mtime = Wed Dec 31 16:00:00 PST 1969
> pZxid = 0x0
> cversion = -1
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 1
> You can look for "^Node does not exist" in the stat output instead of
> checking the exit code. This would get around the problem until a more
> permanent solution could be found.
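The text-matching workaround described above could be sketched as a small shell function. This is only an illustration under stated assumptions: `znode_exists` and `ZK_STAT_CMD` are hypothetical names, the default `hbase zkcli stat` invocation is assumed from the 'zkcli stat' command mentioned in this thread, and the "Node does not exist" message is assumed to start at the beginning of a line as in the output shown above.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a znode exists by reading the text of `stat`,
# because the shell's exit code is 0 in both cases after ZOOKEEPER-1059.

# ZK_STAT_CMD is a hypothetical knob; the default is an assumption about
# how the calling script invokes the ZooKeeper shell.
ZK_STAT_CMD=${ZK_STAT_CMD:-"hbase zkcli stat"}

# Returns 0 if the znode exists, 1 if the shell reported it missing,
# and 2 if the output matched neither pattern (e.g. a connection failure).
znode_exists() {
  local znode=$1 out
  # Merge stderr into stdout: which stream carries the message is not
  # confirmed here, so both are searched.
  out=$($ZK_STAT_CMD "$znode" 2>&1)
  if printf '%s\n' "$out" | grep -q '^Node does not exist'; then
    return 1
  fi
  # A successful stat prints fields such as "cZxid = 0x0".
  if printf '%s\n' "$out" | grep -q '^cZxid'; then
    return 0
  fi
  return 2
}
```

A caller such as rolling-restart.sh could then branch on `znode_exists "$zmaster"` (with `$zmaster` standing in for whatever path the script checks) instead of on the exit code of the shell itself. The separate return value of 2 is a design choice of this sketch so that an unreachable server is not mistaken for an existing node.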
> I hear you re: being time bound (I'd love to work on this myself). In that
> case, would you mind creating a jira based on my suggestion of having
> a new command line tool, giving your hbase case as an example along with
> any requirements you can think of? Perhaps Hartmut or one of the other
> contributors might be interested in working on this.
> > On Tue, Mar 20, 2012 at 9:16 AM, Patrick Hunt <[EMAIL PROTECTED]> wrote:
> >> On Tue, Mar 20, 2012 at 6:57 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >> > I looked at the patch for ZOOKEEPER-1059 which should have converted
> >> > NPE to KeeperException.NoNodeException
> >> >
> >> > Why would 'zkcli stat' command return 0 in case hbase master znode
> >> > expires?
> >> >
> >> > Advice is appreciated.
> >> Hi Ted, sorry to see you're having troubles. I think I see the
> >> disconnect. ZooKeeperMain is first and foremost a user shell. As such
> >> it should not exit unless the quit command is run (or killed
> >> explicitly, etc...). In this case ZOOKEEPER-1059 is fixing a bug in
> >> the shell. It indeed is converting the NPE into a NoNodeException,
> >> which the shell then converts into an error message to the user, and
> >> continues. Prior to this patch the shell was failing on the NPE, which
> >> then generated the non-0 exit from the process.
> >> Note that trunk has some further improvements along these lines that
> >> you might also run into at some point in the future (3.5+):
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-271
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-1391
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-1307
> >> I think what we need is to have a tool that's intended for use both
> >> programmatically and by humans, with more strict requirements about
> >> input, output formatting and command handling, etc... Please see the
> >> work Hartmut has been doing as part of 271 on trunk (3.5.0). Perhaps
> >> we can augment these new classes to also support such a tool. However
> >> it should instead be a true command line tool, rather than a shell.
> >> Would you be available to work on this?
> >> Patrick
> >> ps. bigtop is now helping to verify cross project compatibility, it
> >> would be great if you could introduce some hbase tests that would
> >> flag these breakages in future. When bigtop does its integration (i.e.
> >> runs the hbase tests using the corresponding version of zk) it would
> >> find these problems. We'd catch it much earlier. Thanks!
> >> > FYI Jon filed a JIRA for the issue below which is a blocker for HBase
> >> > trunk.
> >> >
> >> > On Tue, Mar 20, 2012 at 12:36 AM, Jonathan Hsieh <[EMAIL PROTECTED]>
> >> > wrote:
> >> >
> >> >> I'm trying to test HBASE-5589 -- to see if I can add an API call to