HBase, mail # dev - RegionServerSnapshotManager shutdown problem


RegionServerSnapshotManager shutdown problem
Richard Ding 2013-03-07, 01:04
While trying the snapshot code in the HBase 0.94 branch (which should be the
same as 0.94.6RC0), we found that HBase region servers take a long time to
shut down (see the log below). The problem does not exist in 0.94.5. It looks
like the RegionServerSnapshotManager.stop() method closes the ZK session. As a
result, HRegionServer gets a SessionExpiredException when it tries to delete
its ephemeral node (deleteMyEphemeralNode), and it keeps retrying with longer
and longer sleeps before it finally gives up.
... ...
2013-03-06 11:53:19,767 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 256000ms before retry #8...
2013-03-06 11:57:35,806 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
2013-03-06 11:57:35,806 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 512000ms before retry #9...
2013-03-06 12:06:07,882 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
2013-03-06 12:06:07,882 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 1024000ms before retry #10...
2013-03-06 12:23:12,034 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
2013-03-06 12:23:12,034 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper delete failed after 10 retries
2013-03-06 12:23:12,034 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Failed deleting my ephemeral node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:133)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:999)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:988)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1097)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:875)
    at java.lang.Thread.run(Thread.java:738)
2013-03-06 12:23:12,036 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server hdtest010.svl.ibm.com,60020,1362529262252; zookeeper connection closed.
2013-03-06 12:23:12,036 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
2013-03-06 12:23:12,039 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-12,5,main]
2013-03-06 12:23:12,039 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown hook
2013-03-06 12:23:12,039 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
2013-03-06 12:23:12,042 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.
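
To give a rough idea of how much shutdown time these retries can add, here is a small sketch of the backoff arithmetic. It is not HBase's RetryCounter code. The class BackoffSketch and its methods are hypothetical, and it assumes that the doubling seen in the log (256000, 512000, 1024000 ms before retries #8, #9, #10) holds from retry #1 onward with a 1000 ms base, which the truncated log does not show.

```java
// Hypothetical sketch of the backoff arithmetic suggested by the log.
// This is NOT HBase's RetryCounter implementation.
final class BackoffSketch {
    // Assumed base interval, inferred from 256000 ms at retry #8 (1000 * 2^8).
    static final long BASE_MILLIS = 1000L;

    // Sleep before retry #n, assuming the doubling seen at retries #8..#10
    // also applies to the earlier retries that the log excerpt omits.
    static long sleepMillis(int retry) {
        return BASE_MILLIS << retry;
    }

    // Total time spent sleeping before retries #1..#maxRetries.
    static long totalSleepMillis(int maxRetries) {
        long total = 0;
        for (int n = 1; n <= maxRetries; n++) {
            total += sleepMillis(n);
        }
        return total;
    }

    public static void main(String[] args) {
        // Sleeps that are visible in the log excerpt.
        for (int n = 8; n <= 10; n++) {
            System.out.println("retry #" + n + ": " + sleepMillis(n) + " ms");
        }
        // Total under the stated assumption, in ms and in minutes.
        long total = totalSleepMillis(10);
        System.out.println("total: " + total + " ms (" + (total / 60000.0) + " min)");
    }
}
```

Under that assumption the ten sleeps add up to 2046000 ms, about 34 minutes. The three sleeps visible in the log alone account for 1792000 ms, which matches the gap between 11:53:19 and 12:23:12.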