HBase >> mail # dev >> RegionServerSnapshotManager shutdown problem


Re: RegionServerSnapshotManager shutdown problem
Thanks Ted for the quick solution.
On Wed, Mar 6, 2013 at 5:25 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Richard:
> If you can try out the fix from HBASE-8019, that would be great.
>
> Meanwhile, I will run the fix through the 0.94 test suite.
>
> Cheers
>
> On Wed, Mar 6, 2013 at 5:19 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > Looks like the fix from HBASE-7779 wasn't included.
> > See:
> > https://issues.apache.org/jira/secure/attachment/12568663/7779-v2.txt
> >
> > I have created HBASE-8019 for this issue.
> >
> > Thanks for reporting.
> >
> >
> > On Wed, Mar 6, 2013 at 5:04 PM, Richard Ding <[EMAIL PROTECTED]> wrote:
> >
> >> While trying the snapshot code in the HBase 0.94 branch (should be the same as
> >> 0.94.6RC0), we encountered a problem: HBase region servers take a long
> >> time to shut down (see the log below). This problem, however, doesn't exist
> >> in 0.94.5. It looks like the ZK session is closed in the
> >> RegionServerSnapshotManager.stop() method. This results in a
> >> SessionExpiredException when HRegionServer tries to delete its ephemeral
> >> node (deleteMyEphemeralNode).
> >> ... ...
> >> 2013-03-06 11:53:19,767 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 256000ms before retry #8...
> >> 2013-03-06 11:57:35,806 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
> >> 2013-03-06 11:57:35,806 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 512000ms before retry #9...
> >> 2013-03-06 12:06:07,882 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
> >> 2013-03-06 12:06:07,882 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 1024000ms before retry #10...
> >> 2013-03-06 12:23:12,034 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
> >> 2013-03-06 12:23:12,034 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper delete failed after 10 retries
> >> 2013-03-06 12:23:12,034 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Failed deleting my ephemeral node
> >> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
> >>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> >>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >>     at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
> >>     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:133)
> >>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:999)
> >>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:988)
> >>     at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1097)
> >>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:875)
> >>     at java.lang.Thread.run(Thread.java:738)
> >> 2013-03-06 12:23:12,036 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server hdtest010.svl.ibm.com,60020,1362529262252; zookeeper connection closed.
> >> 2013-03-06 12:23:12,036 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
> >> 2013-03-06 12:23:12,039 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting;
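
For a sense of why the shutdown takes so long, here is a rough sketch of the backoff arithmetic visible in the quoted log. The sleep values logged for retries #8, #9 and #10 (256000ms, 512000ms, 1024000ms) double each time, which is consistent with a 1000 ms base shifted left by the retry number. That base and the doubling rule are inferred from those three log lines only; the class and method names below are hypothetical and this is not HBase's actual RetryCounter code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the doubling sleeps seen in the log.
class BackoffSketch {
    // Assumption: 1000 ms base, inferred from "Sleeping 256000ms before retry #8".
    static final long BASE_SLEEP_MS = 1000L;

    // Sleep before the given retry number, doubling on every retry.
    static long sleepMillis(int retryNumber) {
        return BASE_SLEEP_MS << retryNumber;
    }

    // Total time spent sleeping across an inclusive range of retries.
    static long totalSleepMillis(int firstRetry, int lastRetry) {
        long total = 0;
        for (int n = firstRetry; n <= lastRetry; n++) {
            total += sleepMillis(n);
        }
        return total;
    }

    public static void main(String[] args) {
        // Only retries #8..#10 are visible in the quoted log.
        long visible = totalSleepMillis(8, 10);
        System.out.printf("retries #8..#10 sleep %d ms (about %d minutes)%n",
                visible, TimeUnit.MILLISECONDS.toMinutes(visible));
    }
}
```

The three visible sleeps add up to 1792000 ms, which lines up with the gap between the 11:53:19 and 12:23:12 log timestamps. Since an expired session never recovers on its own, each of these sleeps only delays the shutdown.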