Zookeeper >> mail # dev >> ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?


Re: ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?
Patrick:
Appreciate your detailed response.

I haven't finished work in ZOOKEEPER-1407 :-(
So I don't think I have bandwidth to start working on another zookeeper
issue.

Near term, if we can find a way for a shell script to detect the absence
of a particular ZooKeeper node, rolling-restart.sh can be restored.
Otherwise we may need to remove it.
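
One possible near-term approach, as a rough sketch rather than a tested fix:
since the shell now reports a missing node as an error message instead of a
non-zero exit, the script could match on that message. This assumes the zkcli
output contains the text "Node does not exist" when the znode is gone; the
function names znode_absent and wait_for_znode_gone are hypothetical.

```shell
# Hypothetical helpers; not part of rolling-restart.sh.

# znode_absent CMD [ARGS...]
# Runs CMD and succeeds only if its combined output reports a missing node.
# Assumption: the ZooKeeper shell prints "Node does not exist" in that case.
znode_absent() {
  "$@" 2>&1 | grep -q "Node does not exist"
}

# wait_for_znode_gone MAX_TRIES CMD [ARGS...]
# Polls once per second until the node is reported absent.
# Returns 0 once it is gone, 1 if MAX_TRIES attempts were used up.
wait_for_znode_gone() {
  tries=$1
  shift
  while [ "$tries" -gt 0 ]; do
    if znode_absent "$@"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

In rolling-restart.sh this might be called as
  wait_for_znode_gone 60 bin/hbase zkcli stat "$zmaster"
where 60 is an arbitrary retry limit, so the script gives up instead of
hanging if the message never appears.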

FYI, as an HBase committer, I often need to finish incomplete features such as
HBASE-3996.
This takes away a significant amount of time.

Cheers

On Tue, Mar 20, 2012 at 9:16 AM, Patrick Hunt <[EMAIL PROTECTED]> wrote:

> On Tue, Mar 20, 2012 at 6:57 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > I looked at the patch for ZOOKEEPER-1059 which should have converted the
> > NPE to KeeperException.NoNodeException
> >
> > Why would the 'zkcli stat' command return 0 when the hbase master znode
> > expires?
> >
> > Advice is appreciated.
>
> Hi Ted, sorry to see you're having troubles. I think I see the
> disconnect. ZooKeeperMain is first and foremost a user shell. As such
> it should not exit unless the quit command is run (or killed
> explicitly, etc...). In this case ZOOKEEPER-1059 is fixing a bug in
> the shell. It indeed is converting the NPE into a NoNodeException,
> which the shell then converts into an error message to the user, and
> continues. Prior to this patch the shell was failing on the NPE, which
> then generated the non-0 exit from the process.
>
> Note that trunk has some further improvements along these lines that
> you might also run into at some point in the future (3.5+):
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-271
> https://issues.apache.org/jira/browse/ZOOKEEPER-1391
> https://issues.apache.org/jira/browse/ZOOKEEPER-1307
>
> I think what we need is to have a tool that's intended for use both
> programmatically and by humans, with more strict requirements about
> input, output formatting and command handling, etc... Please see the
> work Hartmut has been doing as part of 271 on trunk (3.5.0). Perhaps
> we can augment these new classes to also support such a tool. However
> it should instead be a true command line tool, rather than a shell.
> Would you be available to work on this?
>
> Patrick
>
> ps. bigtop is now helping to verify cross project compatibility, it
> would be great if you could introduce some hbase tests that would
> flag these breakages in future. When bigtop does its integration (i.e.
> runs the hbase tests using the corresponding version of zk) it would
> find these problems. We'd catch it much earlier. Thanks!
>
>
> > FYI Jon filed a JIRA for the issue below, which is a blocker for HBase
> > trunk.
> >
> > On Tue, Mar 20, 2012 at 12:36 AM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:
> >
> >> I'm trying to test HBASE-5589 -- to see if I can add an API call to
> >> HMasterInterface and do a rolling-restart / upgrade on a live cluster,
> >> which led me down another rabbit hole.
> >>
> >> I'm wondering how the rolling-restart.sh script worked in the past (I can
> >> spend more time setting up an older version to test this, but figured I'd
> >> ask).
> >>
> >> I'm getting stuck when bin/rolling-restart.sh tries to wait until the
> >> Master ZNode expires.  In this particular case, the script seems to hang
> >> there forever (even after the /hbase/master ephemeral node expires).
> >>
> >> Here's the code in the script:
> >> ----
> >> # make sure the master znode has been deleted before continuing
> >>    zparent=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.parent`
> >>    if [ "$zparent" == "null" ]; then zparent="/hbase"; fi
> >>    zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.master`
> >>    if [ "$zmaster" == "null" ]; then zmaster="master"; fi
> >>    zmaster=$zparent/$zmaster
> >>    echo -n "Waiting for Master ZNode ${zmaster} to expire"
> >>    while bin/hbase zkcli stat $zmaster >/dev/null 2>&1; do
> >>      echo -n "."
> >>      sleep 1
> >>    done
> >>    echo #force a newline