

region server down when scanning using mapreduce
Hi,

When we use MapReduce to dump data from a fairly large HBase table, one region server crashes, and then another. MapReduce is deployed alongside HBase.

1) The region server log shows both "next" and "multi" operations in progress. Could a read/write conflict be causing the scanner timeouts?
2) The region server has 24 cores, and the maximum number of map tasks is also 24. The table has about 30 regions (0.5 GB each) on that region server. Could MapReduce be using all of the CPU, so that the region server slows down and then times out?
3) hbase.regionserver.handler.count is currently at its default of 10. Should it be increased?

Please give us some advice.

Thanks,
Wei
Log information:
[Regionserver rs21:]

2013-03-11 18:36:28,148 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/adcbg21.machine.wisdom.com,60020,1363010589837/rs21%2C60020%2C1363010589837.1363025554488, entries=22417, filesize=127539793.  for /hbase/.logs/rs21,60020,1363010589837/rs21%2C60020%2C1363010589837.1363026988052
2013-03-11 18:37:39,481 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 28183ms instead of 3000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
2013-03-11 18:37:40,163 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":29830,"call":"next(1656517918313948447, 1000), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.127.21:56058","starttimems":1363027030280,"queuetimems":4602,"class":"HRegionServer","responsesize":2774484,"method":"next"}
2013-03-11 18:37:40,163 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":31195,"call":"next(-8353194140406556404, 1000), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.127.21:56529","starttimems":1363027028804,"queuetimems":3634,"class":"HRegionServer","responsesize":2270919,"method":"next"}
2013-03-11 18:37:40,163 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":30965,"call":"next(2623756537510669130, 1000), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.127.21:56146","starttimems":1363027028807,"queuetimems":3484,"class":"HRegionServer","responsesize":2753299,"method":"next"}
2013-03-11 18:37:40,236 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":31023,"call":"next(5293572780165196795, 1000), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.127.21:56069","starttimems":1363027029086,"queuetimems":3589,"class":"HRegionServer","responsesize":2722543,"method":"next"}
2013-03-11 18:37:40,368 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":31160,"call":"next(-4285417329791344278, 1000), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.127.21:56586","starttimems":1363027029204,"queuetimems":3707,"class":"HRegionServer","responsesize":2938870,"method":"next"}
2013-03-11 18:37:43,652 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":31249,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@2d19985a), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.109.21:35342","starttimems":1363027031505,"queuetimems":5720,"class":"HRegionServer","responsesize":0,"method":"multi"}
2013-03-11 18:37:49,108 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":38813,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@19c59a2e), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.125.11:57078","starttimems":1363027030273,"queuetimems":4663,"class":"HRegionServer","responsesize":0,"method":"multi"}
2013-03-11 18:37:50,410 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":38893,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@40022ddb), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.109.20:51698","starttimems":1363027031505,"queuetimems":5720,"class":"HRegionServer","responsesize":0,"method":"multi"}
2013-03-11 18:37:50,642 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":40037,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@6b8bc8cf), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.125.11:57078","starttimems":1363027030601,"queuetimems":4818,"class":"HRegionServer","responsesize":0,"method":"multi"}
2013-03-11 18:37:51,529 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":10880,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@6928d7b), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.125.11:57076","starttimems":1363027060645,"queuetimems":34763,"class":"HRegionServer","responsesize":0,"method":"multi"}
2013-03-11 18:37:51,776 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":41327,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@354baf25), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.125.11:57076","starttimems":1363027030411,"queuetimems":4680,"class":"HRegionServer","responsesize":0,"method":"multi"}
2013-03-11 18:38:32,361 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":10204,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@6d86b477), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.125.10:36950","starttimems":1363027102044,"queuetimems":11027,"class":"HRegionServer","responsesize":0,"method":"multi"}

[master:]
2013-03-11 18:35:39,386 WARN org.apache.hadoop.conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
2013-03-11 18:38:25,892 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=10 regions=477 average=47.7 mostloaded=52 leastloade
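
To compare how the slow "next" and "multi" calls in the region server log above behave, the responseTooSlow warnings can be summarized per RPC method. The following is a minimal sketch in Python, not part of HBase: the helper names (parse_too_slow, summarize) are hypothetical, and it assumes every responseTooSlow line ends with a JSON object, as the lines above do. The sample lines are copied from the log above.

```python
import json
import re
from collections import defaultdict

# Matches the JSON payload that follows the "(responseTooSlow):" tag.
TOO_SLOW = re.compile(r"\(responseTooSlow\):\s*(\{.*\})\s*$")

# Three lines copied verbatim from the region server log in this message.
SAMPLE = [
    '2013-03-11 18:37:39,481 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 28183ms instead of 3000ms, this is likely due to a long garbage collecting pause and it\'s usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired',
    '2013-03-11 18:37:40,163 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":29830,"call":"next(1656517918313948447, 1000), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.127.21:56058","starttimems":1363027030280,"queuetimems":4602,"class":"HRegionServer","responsesize":2774484,"method":"next"}',
    '2013-03-11 18:37:51,529 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":10880,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@6928d7b), rpc version=1, client version=29, methodsFingerPrint=54742778","client":"10.20.125.11:57076","starttimems":1363027060645,"queuetimems":34763,"class":"HRegionServer","responsesize":0,"method":"multi"}',
]


def parse_too_slow(lines):
    """Yield the decoded JSON payload of every responseTooSlow line."""
    for line in lines:
        match = TOO_SLOW.search(line)
        if match:
            yield json.loads(match.group(1))


def summarize(lines):
    """Group slow calls by RPC method: call count and worst-case timings."""
    stats = defaultdict(
        lambda: {"count": 0, "max_processing_ms": 0, "max_queue_ms": 0}
    )
    for call in parse_too_slow(lines):
        entry = stats[call["method"]]
        entry["count"] += 1
        entry["max_processing_ms"] = max(
            entry["max_processing_ms"], call["processingtimems"]
        )
        entry["max_queue_ms"] = max(entry["max_queue_ms"], call["queuetimems"])
    return dict(stats)


if __name__ == "__main__":
    for method, entry in summarize(SAMPLE).items():
        print(method, entry)
```

To run it against a full log, pass an open file object to summarize in place of SAMPLE. Keeping queue time separate from processing time may help distinguish calls that waited for a free handler from calls that were slow once they started running.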