HBase >> mail # dev >> HBase 0.94.15: writes stalls periodically even under moderate steady load (AWS EC2)


Re: Re: HBase 0.94.15: writes stalls periodically even under moderate steady load (AWS EC2)
Just curious, what's your Hadoop version, Vladimir?
At least on Hadoop 2.0+, the default ReplaceDatanode policy should pick another DN when the pipeline is set up again (setupPipeline). So if only one DN is broken, the write should still go to 3 nodes successfully, and HBase's "hbase.regionserver.hlog.tolerable.lowreplication" check will not kick in :)
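
To make the point concrete, here is a rough toy model of the interaction, not the actual HBase or HDFS code. The names `pipeline_after_failure`, `LowReplicationCheck`, `roll_limit` and the spare node address are made up for illustration. The threshold of 3 comes from the "expecting no less than 3 replicas" message in the log below, and the limit of 5 back-to-back roll requests is only an assumption inferred from the five warnings that precede the "Too many consecutive RollWriter requests" message.

```python
from dataclasses import dataclass


def pipeline_after_failure(pipeline, bad_node, spare_nodes, replace_on_failure):
    """Drop the bad datanode; optionally substitute one spare (hypothetical helper)."""
    survivors = [dn for dn in pipeline if dn != bad_node]
    if replace_on_failure and spare_nodes:
        # Assumed behaviour of a replace-datanode policy: refill the pipeline.
        survivors.append(spare_nodes[0])
    return survivors


@dataclass
class LowReplicationCheck:
    """Toy stand-in for the WAL low-replication check (not HBase source)."""
    min_tolerable_replicas: int = 3  # taken from the log message, an assumption
    roll_limit: int = 5              # assumed cap on consecutive roll requests
    consecutive_rolls: int = 0
    roller_enabled: bool = True

    def roll_needed(self, live_replicas):
        """Return True when a log roll would be requested."""
        if live_replicas >= self.min_tolerable_replicas:
            return False  # pipeline is healthy, nothing to do
        if not self.roller_enabled:
            return False  # roller already gave up
        if self.consecutive_rolls < self.roll_limit:
            self.consecutive_rolls += 1
            return True
        # Too many back-to-back requests: stop asking for rolls.
        self.consecutive_rolls = 0
        self.roller_enabled = False
        return False


# Pipeline and bad datanode as they appear in the RS log below.
pipeline = ["10.10.25.199:50010", "10.40.249.135:50010", "10.38.106.234:50010"]
bad = "10.38.106.234:50010"
spares = ["spare-dn:50010"]  # hypothetical extra datanode

shrunk = pipeline_after_failure(pipeline, bad, spares, replace_on_failure=False)
replaced = pipeline_after_failure(pipeline, bad, spares, replace_on_failure=True)

print(len(shrunk), LowReplicationCheck().roll_needed(len(shrunk)))
print(len(replaced), LowReplicationCheck().roll_needed(len(replaced)))
```

In this model the roll is requested only when the pipeline shrinks to 2 nodes. If the bad DN is replaced, the pipeline stays at 3 and the check stays quiet, which is what I would expect to see on Hadoop 2.0+.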

Thanks,
________________________________________
From: Vladimir Rodionov [[EMAIL PROTECTED]]
Sent: January 16, 2014 14:45
To: [EMAIL PROTECTED]
Cc: lars hofhansl
Subject: Re: Re: HBase 0.94.15: writes stalls periodically even under moderate steady load (AWS EC2)

This is what I found in a RS log:
2014-01-16 01:22:18,256 ResponseProcessor for block
blk_5619307008368309102_2603 WARN  [DFSClient] DFSOutputStream
ResponseProcessor exception  for block
blk_5619307008368309102_2603java.io.IOException: Bad response 1 for block
blk_5619307008368309102_2603 from datanode 10.38.106.234:50010
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2977)

2014-01-16 01:22:18,258 DataStreamer for file
/hbase/.logs/ip-10-10-25-199.ec2.internal,60020,1389843986689/ip-10-10-25-199.ec2.internal%2C60020%2C1389843986689.1389853200626
WARN  [DFSClient] Error Recovery for block blk_5619307008368309102_2603 bad
datanode[2] 10.38.106.234:50010
2014-01-16 01:22:18,258 DataStreamer for file
/hbase/.logs/ip-10-10-25-199.ec2.internal,60020,1389843986689/ip-10-10-25-199.ec2.internal%2C60020%2C1389843986689.1389853200626
WARN  [DFSClient] Error Recovery for block blk_5619307008368309102_2603 in
pipeline 10.10.25.199:50010, 10.40.249.135:50010, 10.38.106.234:50010: bad
datanode 10.38.106.234:50010
2014-01-16 01:22:22,800 IPC Server handler 10 on 60020 WARN  [HLog] HDFS
pipeline error detected. Found 2 replicas but expecting no less than 3
replicas.  Requesting close of hlog.
2014-01-16 01:22:22,806 IPC Server handler 2 on 60020 WARN  [HLog] HDFS
pipeline error detected. Found 2 replicas but expecting no less than 3
replicas.  Requesting close of hlog.
2014-01-16 01:22:22,808 IPC Server handler 28 on 60020 WARN  [HLog] HDFS
pipeline error detected. Found 2 replicas but expecting no less than 3
replicas.  Requesting close of hlog.
2014-01-16 01:22:22,808 IPC Server handler 13 on 60020 WARN  [HLog] HDFS
pipeline error detected. Found 2 replicas but expecting no less than 3
replicas.  Requesting close of hlog.
2014-01-16 01:22:22,808 IPC Server handler 27 on 60020 WARN  [HLog] HDFS
pipeline error detected. Found 2 replicas but expecting no less than 3
replicas.  Requesting close of hlog.
2014-01-16 01:22:22,811 IPC Server handler 22 on 60020 WARN  [HLog] Too
many consecutive RollWriter requests, it's a sign of the total number of
live datanodes is lower than the tolerable replicas.
2014-01-16 01:22:22,911 IPC Server handler 8 on 60020 INFO  [HLog]
LowReplication-Roller was enabled.
2014-01-16 01:22:22,930 regionserver60020.cacheFlusher INFO  [HRegion]
Finished memstore flush of ~128.3m/134538640, currentsize=3.0m/3113200 for
region usertable,,1389844429593.d4843a72f02a7396244930162fbecd06. in
68096ms, sequenceid=108753, compaction requested=false
2014-01-16 01:22:22,930 regionserver60020.logRoller INFO  [FSUtils]
FileSystem doesn't support getDefaultReplication
2014-01-16 01:22:22,930 regionserver60020.logRoller INFO  [FSUtils]
FileSystem doesn't support getDefaultBlockSize
2014-01-16 01:22:23,027 regionserver60020.logRoller INFO  [HLog] Roll
/hbase/.logs/ip-10-10-25-199.ec2.internal,60020,1389843986689/ip-10-10-25-199.ec2.internal%2C60020%2C1389843986689.1389853200626,
entries=1012, filesize=140440002.  for
/hbase/.logs/ip-10-10-25-199.ec2.internal,60020,1389843986689/ip-10-10-25-199.ec2.internal%2C60020%2C1389843986689.1389853342930
2014-01-16 01:22:23,194 IPC Server handler 23 on 60020 WARN  [HBaseServer]
(responseTooSlow):
{"processingtimems":68410,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@51ff528e),
rpc version=1, client version=29, methodsFingerPrint=-540141542","client":"
10.38.163.32:51727
","starttimems":1389853274560,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}
2014-01-16 01:22:23,401 IPC Server handler 13 on 60020 WARN  [HBaseServer]
(responseTooSlow):
{"processingtimems":68813,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@4e136610),
rpc version=1, client version=29, methodsFingerPrint=-540141542","client":"
10.38.163.32:51727
","starttimems":1389853274586,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}
2014-01-16 01:22:23,609 IPC Server handler 1 on 60020 WARN  [HBaseServer]
(responseTooSlow):
{"processingtimems":69002,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@51390a8),
rpc version=1, client version=29, methodsFingerPrint=-540141542","client":"
10.38.163.32:51727
","starttimems":1389853274604,"queuetimems":1,"class":"HRegionServer","responsesize":0,"method":"multi"}
2014-01-16 01:22:23,629 IPC Server handler 20 on 60020 WARN  [HBaseServer]
(responseTooSlow):
{"processingtimems":68991,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@5f125a0f),
rpc version=1, client version=29, methodsFingerPrint=-540141542","client":"
10.38.163.32:51727
","starttimems":1389853274635,"queuetimems":1,"class":"HRegionServer","responsesize":0,"method":"multi"}
2014-01-16 01:22:23,656 IPC Server handler 27 on 60020 WARN  [HBaseServer]
(responseTooSlow):
{"processingtimems":68835,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@2dd6bf8c),
rpc version=1, client version=29, methodsFingerPrint=-540141542","client":"
10.38.163.32:51727
","starttimems":1389853274818,"queuetimems":1,"class":"HRegionServer","responsesize":0,"method":"multi"}
2014-01-16 01:22:23,657 IPC Server handler 19 on 60020 WARN  [HBaseServer]
(responseTooSlow):
{"processingtimems":68982,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@6db997d6),
rpc version=1, client version=29, methodsFingerPrint=-540141542","client":"
10.38.163.32:51727
","startti