HBase >> mail # dev >> RegionServerSnapshotManager shotdown problem


Richard Ding 2013-03-07, 01:04
Ted Yu 2013-03-07, 01:19
Re: RegionServerSnapshotManager shotdown problem
Richard:
If you can try out the fix from HBASE-8019, that would be great.

Meanwhile, I will run the fix through the 0.94 test suite.

Cheers

On Wed, Mar 6, 2013 at 5:19 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Looks like the fix from HBASE-7779 wasn't included.
> See:
> https://issues.apache.org/jira/secure/attachment/12568663/7779-v2.txt
>
> I have created HBASE-8019 for this issue.
>
> Thanks for reporting.
>
>
> On Wed, Mar 6, 2013 at 5:04 PM, Richard Ding <[EMAIL PROTECTED]> wrote:
>
>> While trying the snapshot code in the HBase 0.94 branch (should be the same
>> as 0.94.6RC0), we found that HBase region servers take a long time to shut
>> down (see the log below). This problem does not exist in 0.94.5. It looks
>> like the ZK session is closed in the RegionServerSnapshotManager.stop()
>> method, which results in a SessionExpiredException when HRegionServer
>> tries to delete its ephemeral node (deleteMyEphemeralNode).
>> ... ...
>> 2013-03-06 11:53:19,767 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 256000ms before retry #8...
>> 2013-03-06 11:57:35,806 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
>> 2013-03-06 11:57:35,806 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 512000ms before retry #9...
>> 2013-03-06 12:06:07,882 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
>> 2013-03-06 12:06:07,882 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 1024000ms before retry #10...
>> 2013-03-06 12:23:12,034 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
>> 2013-03-06 12:23:12,034 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper delete failed after 10 retries
>> 2013-03-06 12:23:12,034 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Failed deleting my ephemeral node
>> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs/hdtest010.svl.ibm.com,60020,1362529262252
>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
>> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>> at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
>> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:133)
>> at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:999)
>> at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:988)
>> at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1097)
>> at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:875)
>> at java.lang.Thread.run(Thread.java:738)
>> 2013-03-06 12:23:12,036 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server hdtest010.svl.ibm.com,60020,1362529262252; zookeeper connection closed.
>> 2013-03-06 12:23:12,036 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
>> 2013-03-06 12:23:12,039 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-12,5,main]
>> 2013-03-06 12:23:12,039 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Shutdown hook
>> 2013-03-06 12:23:12,039 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
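
A note on why the shutdown takes so long: the sleep intervals in the log above double with each retry (256000 ms, 512000 ms, 1024000 ms for retries #8 to #10), which matches 1000 ms * 2^retry. The sketch below only reproduces that arithmetic. It is not the HBase RetryCounter code; the class name, the method names, and the 1000 ms base are assumptions inferred from the three logged values, and the log does not show retries #1 to #7, so the total for all ten retries is an extrapolation.

```java
// Hypothetical helper that reproduces the sleep intervals seen in the log.
// Not the HBase RetryCounter implementation.
final class BackoffEstimate {
    // Assumed base interval, inferred from the log: 256000 ms = 1000 ms * 2^8.
    static final long BASE_SLEEP_MS = 1000L;

    /** Sleep before the given retry, assuming the interval doubles each time. */
    static long sleepBeforeRetryMs(int retry) {
        return BASE_SLEEP_MS << retry;
    }

    /** Total sleep across retries first..last, inclusive. */
    static long totalSleepMs(int first, int last) {
        long total = 0;
        for (int r = first; r <= last; r++) {
            total += sleepBeforeRetryMs(r);
        }
        return total;
    }

    public static void main(String[] args) {
        for (int r = 8; r <= 10; r++) {
            System.out.println("retry #" + r + ": " + sleepBeforeRetryMs(r) + " ms");
        }
        // Retries #8 to #10 are the ones visible in the log.
        System.out.println("retries #8-#10: " + totalSleepMs(8, 10) / 1000 + " s");
        // Extrapolated: assumes retries #1 to #7 followed the same pattern.
        System.out.println("retries #1-#10 (assumed): " + totalSleepMs(1, 10) / 1000 + " s");
    }
}
```

The three logged retries alone add up to 1792 s, which agrees with the timestamps (11:53:19 to 12:23:12, just under 30 minutes). Since the session is already expired, none of these retries can succeed, so the whole wait is spent before the region server finally exits.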
Richard Ding 2013-03-07, 01:47