HBase, mail # user - Slow log splitting (Hbase 0.94.6)


Eran Kutner 2013-06-18, 12:11
Ted Yu 2013-06-18, 12:15
Re: Slow log splitting (Hbase 0.94.6)
Eran Kutner 2013-06-18, 16:43
Sorry, I forgot to mention that we're using CDH 4.3, so Hadoop 2.0.0.

I'm not sure exactly what to look for in the NameNode logs, but grepping for
"lease" produced only a handful of results, all of which seem benign.
There were two occurrences of this:
2013-06-17 09:58:40,709 INFO org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease.  Holder: HDFS_NameNode, pendingcreates: 7] has expired hard limit
2013-06-17 09:58:40,709 WARN BlockStateChange: BLOCK* BlockInfoUnderConstruction.initLeaseRecovery: No blocks found, lease removed.
2013-06-17 09:58:40,709 WARN BlockStateChange: BLOCK* BlockInfoUnderConstruction.initLeaseRecovery: No blocks found, lease removed.
2013-06-17 09:58:40,710 WARN BlockStateChange: BLOCK* BlockInfoUnderConstruction.initLeaseRecovery: No blocks found, lease removed.
2013-06-17 09:58:40,710 WARN BlockStateChange: BLOCK* BlockInfoUnderConstruction.initLeaseRecovery: No blocks found, lease removed.
2013-06-17 09:58:40,710 WARN BlockStateChange: BLOCK* BlockInfoUnderConstruction.initLeaseRecovery: No blocks found, lease removed.
2013-06-17 09:58:40,710 WARN BlockStateChange: BLOCK* BlockInfoUnderConstruction.initLeaseRecovery: No blocks found, lease removed.
2013-06-17 09:58:40,710 WARN BlockStateChange: BLOCK* BlockInfoUnderConstruction.initLeaseRecovery: No blocks found, lease removed.

and one occurrence of this:
2013-06-17 10:04:25,012 INFO org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease.  Holder: DFSClient_NONMAPREDUCE_1958633016_62, pendingcreates: 1] has expired hard limit
2013-06-17 10:04:25,012 WARN org.apache.hadoop.hdfs.StateChange: BLOCK* internalReleaseLease: All existing blocks are COMPLETE, lease removed, file closed.

However, I did notice quite a lot of these errors (almost a million of
them):
INFO org.apache.hadoop.security.JniBasedUnixGroupsMapping: Error getting groups for hbase: No entry for user

I don't know why the NN is even trying to resolve the user's security groups,
since I have dfs.permissions set to false. Could this be the cause of the
problem?
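
For anyone who wants to repeat the grep-and-count check above in a more
structured way, here is a minimal sketch in Python. It assumes each log entry
sits on a single line in the usual "date time LEVEL logger: message" layout
shown in the excerpts above; the function name tally and the command-line
usage are made up for illustration and are not part of Hadoop or HBase.

```python
import re
import sys
from collections import Counter

# Start of a log line: "YYYY-MM-DD HH:MM:SS,mmm LEVEL logger:".
# Assumption: one log entry per line (no mail-style wrapping).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>[^\s:]+):"
)


def tally(lines, keyword=None):
    """Count log lines per (level, logger), optionally filtered by keyword."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # continuation line or unrecognized format
        if keyword and keyword.lower() not in line.lower():
            continue  # case-insensitive filter, like `grep -i`
        counts[(match.group("level"), match.group("logger"))] += 1
    return counts


# Two entries copied from the NameNode excerpts above.
SAMPLE = [
    "2013-06-17 09:58:40,709 INFO "
    "org.apache.hadoop.hdfs.server.namenode.LeaseManager: [Lease.  Holder: "
    "HDFS_NameNode, pendingcreates: 7] has expired hard limit",
    "2013-06-17 09:58:40,709 WARN BlockStateChange: BLOCK* "
    "BlockInfoUnderConstruction.initLeaseRecovery: No blocks found, lease "
    "removed.",
]

if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python tally.py <path-to-namenode-log> [keyword]
    keyword = sys.argv[2] if len(sys.argv) > 2 else None
    with open(sys.argv[1], errors="replace") as handle:
        for (level, logger), n in tally(handle, keyword).most_common():
            print(f"{n:>10}  {level:<5} {logger}")
```

Run without a keyword, this lists the noisiest loggers first, which makes it
easy to spot something like the group-mapping message dominating the log.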
-eran
On Tue, Jun 18, 2013 at 3:15 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> What Hadoop version are you using?
>
> Can you check the NameNode log to see if lease recovery took a long time?
>
> Cheers
>
> On Jun 18, 2013, at 5:11 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > We had a brute-force cluster shutdown event that was followed by log
> > recovery when the cluster came back online.
> > The cluster took hours to split the logs and recover the regions. That
> > might have made sense, since we have quite a lot of regions (around
> > 13K), but the weird thing is that there was no obvious bottleneck during
> > the recovery process. CPU was almost idle on all the nodes, IO was at
> > 5-20% utilization, memory was OK, and the network wasn't overloaded, but
> > it was still slow.
> > Any idea what could be slowing it down?
> >
> > Thanks.
> >
> > -eran
>