张磊 2012-10-18, 11:30
Ramkrishna.S.Vasudevan 2012-10-18, 12:15
-Re: one RegionServer crashed and the whole cluster was blocked
Nicolas Liochon 2012-10-18, 12:55
Some stuff below:
On Thu, Oct 18, 2012 at 1:30 PM, 张磊 <[EMAIL PROTECTED]> wrote:
> Hi, All
> One of the RegionServers in our company’s cluster crashed. At that
> time, I found:
> 1. All the RegionServers stopped handling requests from the client
> side (requestsPerSecond=0 on the master-status UI page).
> 2. It took about 12-15 minutes to recover.
> 3. I have set hbase.regionserver.restart.on.zk.expire to true, but it
> does not work.
> For 1, I know the cluster began to split logs and recover the data on the
> crashed RegionServer. Will the recovery operation block all requests
> from the client side?
No. But it's worth checking that the region server that died was not the one
handling the .meta. region. If it was, that could be an explanation
(clients do have a cache, but for first-time access to a region they go to
the .meta. region first).
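
To make the cache point concrete, here is a toy model in Python. It is only
a sketch: ToyClient, MetaUnavailable and the region/server names are made up
for illustration and are not HBase client APIs. It also ignores stale cache
entries that point at the dead server.

```python
class MetaUnavailable(Exception):
    """Raised when the server hosting .meta. is down (toy model only)."""


class ToyClient:
    """Hypothetical client with a region-location cache backed by .meta."""

    def __init__(self, meta, meta_online=True):
        self.meta = meta                # region -> server, stands in for .meta.
        self.meta_online = meta_online  # False while the .meta. host is dead
        self.cache = {}                 # client-side location cache

    def locate(self, region):
        # Cached locations never touch .meta.
        if region in self.cache:
            return self.cache[region]
        # First-time access must go through .meta. and fails if it is offline.
        if not self.meta_online:
            raise MetaUnavailable(region)
        self.cache[region] = self.meta[region]
        return self.cache[region]


client = ToyClient({"region-a": "rs1", "region-b": "rs2"})
client.locate("region-a")    # first access: read from .meta., then cached
client.meta_online = False   # the server hosting .meta. dies
client.locate("region-a")    # still answered from the cache
```

In this model a lookup of "region-b" after the .meta. host dies raises
MetaUnavailable, which is the kind of cluster-wide stall described above.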
> For 2, Is there any solution to reduce the recovery time?
12 minutes for a single region server crash (i.e. the datanode is still
there, the cluster is ok) seems huge.
You need to look at:
- a possible root cause: if the region server got disconnected, it may be
because the network or ZooKeeper was in bad shape anyway. So the
recovery is slow because the cause of the crash is still there.
- how is your cluster? Do you have a lot of regions to recover? Did you
have a lot of writes on this region server?
> For 3, I checked the log and found a “session is timeout” exception; maybe
> a full GC caused the session to time out. But why does
> hbase.regionserver.restart.on.zk.expire not work? My HBase version is
I'm not sure it's still in the code base. To be checked. Also, you can
have a root cause that makes the server stop.
But there are two sides to a ZK disconnect anyway:
1) the region server: if it's disconnected but actually still alive, it
may decide to kill itself, or not.
2) the cluster: after the timeout, the timed-out regionserver is considered
dead and the recovery starts, whatever happens in 1). So
whatever happens in 1) does not change much from an MTTR point of view,
unless your cluster is small or you're losing multiple nodes.
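
A rough way to see this, as a Python sketch. The Crash fields and the numbers
are assumptions chosen for illustration, not measurements or HBase settings;
the only point is that the restart decision on side 1) does not enter the
sum on side 2).

```python
from dataclasses import dataclass


@dataclass
class Crash:
    zk_session_timeout_s: float  # assumed wait before the server is declared dead
    log_split_s: float           # assumed time to split the dead server's logs
    reassign_s: float            # assumed time to reopen its regions elsewhere
    server_restarts: bool        # side 1): does the region server come back?


def mttr_seconds(c: Crash) -> float:
    # Side 2) drives recovery: detection waits for the session timeout,
    # then log splitting and region reassignment follow.
    # c.server_restarts is deliberately unused here.
    return c.zk_session_timeout_s + c.log_split_s + c.reassign_s


with_restart = mttr_seconds(Crash(180, 60, 30, server_restarts=True))
without_restart = mttr_seconds(Crash(180, 60, 30, server_restarts=False))
print(with_restart == without_restart)
```

A restarted server only adds capacity back afterwards, which matters mostly
on a small cluster or when several nodes are lost at once.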
There is an autorestart option in the 0.96 scripts. It does not change
the MTTR itself, but it covers more cases of regionserver crashes. See the
release notes in HBASE-5939.