Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Zookeeper >> mail # dev >> ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?


Copy link to this message
-
ZOOKEEPER-1059 Was: Does the rolling-restart.sh script work?
I looked at the patch for ZOOKEEPER-1059 which should have converted the
NPE to KeeperException.NoNodeException

Why would 'zkcli stat' command return 0 in case hbase master znode expires ?

Advice is appreciated.

FYI Jon filed a JIRA for the issue below which is a blocker for HBase trunk.

On Tue, Mar 20, 2012 at 12:36 AM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:

> I'm trying to test HBASE-5589 -- to see if I can add an API call to
> HMasterInterface and do a rolling-restart / upgrade on a live cluster which
> lead me down another rabbit hole.
>
> I'm wondering how rolling-restart.sh script worked in the past (I can spend
> more time setting up an older version to test this, but figured I'd ask).
>
> I'm getting stuck when the bin/rolling-restart.sh tries to wait until the
> Master ZNode expires.  In this particular case, the script seems to hang
> there forever (even after the /hbase/master ephemeral node expires).
>
> Here's the code in the script:
> ----
> # make sure the master znode has been deleted before continuing
>    zparent=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool
> zookeeper.znode.parent`
>    if [ "$zparent" == "null" ]; then zparent="/hbase"; fi
>    zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool
> zookeeper.znode.master`
>    if [ "$zmaster" == "null" ]; then zmaster="master"; fi
>    zmaster=$zparent/$zmaster
>    echo -n "Waiting for Master ZNode ${zmaster} to expire"
>    while bin/hbase zkcli stat $zmaster >/dev/null 2>&1; do
>      echo -n "."
>      sleep 1
>    done
>    echo #force a newline
> ----
>
> The problem is that 'bin/hbase zkcli stat /hbase/master ...' seems to
> always returns with $? == 0 regardless if the znode is present or not
> present!  I've checked with Patrick Hunt (ZK committer) and this the
> expected behavior.  The only non-zero retcodes are for abnormal exits
> (exceptions thrown)
>
> Here's the ZK code I was looking through
>
> https://github.com/apache/zookeeper/blob/release-3.4.3/src/java/main/org/apache/zookeeper/ZooKeeperMain.java#L736
>
>
> https://github.com/apache/zookeeper/blob/release-3.4.3/src/java/main/org/apache/zookeeper/ZooKeeper.java#L980
>
>
> Thoughts?
>
> Jon.
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // [EMAIL PROTECTED]
>
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB