HBase >> mail # user >> gc pause killing regionserver


Re: gc pause killing regionserver
What is the size of the regions on your regionservers?
Consider reading the posts by Todd Lipcon on Cloudera's blog
about this topic:

Avoiding Full GCs in HBase with MemStore-Local Allocation Buffers: Part 1
http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/

Avoiding Full GCs in HBase with MemStore-Local Allocation Buffers: Part 2
http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-2/

Another thing to consider is the advice of Ryan Rawson, Sr.
Developer at StumbleUpon and HBase committer:
* Monitor, monitor, monitor. Ganglia can help you with this.
* HDFS: set the xciever limit to 2048 and Xmx2000m. This avoids a lot of
trouble with HDFS, even under heavy load.
* Give HBase enough RAM.
* For row import, a randomized key insert order gives a substantial
speedup (tested at StumbleUpon with 9 billion rows).
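
To illustrate the last point, here is a minimal sketch of randomizing the
insert order of row keys before an import. The class and method names are
hypothetical, and the in-memory shuffle is only an assumption for small
batches; an import of billions of rows would need to randomize in some other
way (for example per batch or per input split). The actual HBase write is
left as a comment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RandomizedImportOrder {

    /** Returns a shuffled copy of the row keys; the input list is not modified. */
    public static List<String> shuffledKeys(List<String> sortedKeys, long seed) {
        List<String> copy = new ArrayList<>(sortedKeys);
        // A fixed seed makes the order reproducible between runs.
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add(String.format("row-%05d", i));
        }
        for (String key : shuffledKeys(keys, 42L)) {
            // A real import would issue the HBase write for this key here.
            System.out.println(key);
        }
    }
}
```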

Look here too:
http://www.cloudera.com/blog/2011/04/hbase-dos-and-donts/
http://www.meetup.com/LA-HUG/pages/Video_from_April_13th_HBASE_DO%27S_and_DON%27TS/

Best wishes

On 03/20/2012 06:19 AM, Ferdy Galema wrote:
> This morning there was a crash that led to aborting 10 regionservers out of
> the 15. After debugging the logs it seems that "OutOfMemoryError: Java heap
> space" occurred at the regionservers. This was because a running job
> combined too large a scanner-caching value with retrieving full
> rows. It makes sense that regionservers cannot cope with several
> clients (mapred tasks) each requesting 100~200 MB buffers to be filled. The
> obvious solution is a lower scanner-caching value for jobs that retrieve
> too much data per row on average.
>
> A nice server-side solution would be to dynamically adjust the
> scanner-caching value when the responses are too large. For example, if a
> response is over 100MB (configurable), then halve the scanner-caching
> value.
>
> See log below.
> 2012-03-20 07:57:40,092 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 60020, responseTooLarge for: next(4438820558358059204, 1000)
> from 172.23.122.15:50218: Size: 105.0m
> 2012-03-20 07:57:53,226 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 60020, responseTooLarge for: next(-7429189123174849941, 1000)
> from 172.23.122.15:50218: Size: 214.4m
> 2012-03-20 07:57:57,839 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 60020, responseTooLarge for: next(-7429189123174849941, 1000)
> from 172.23.122.15:50218: Size: 103.2m
> 2012-03-20 07:57:59,442 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 60020, responseTooLarge for: next(-7429189123174849941, 1000)
> from 172.23.122.15:50218: Size: 101.8m
> 2012-03-20 07:58:20,025 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60020, responseTooLarge for: next(9033159548564260857, 1000)
> from 172.23.122.15:50218: Size: 107.2m
> 2012-03-20 07:58:27,273 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 60020, responseTooLarge for: next(9033159548564260857, 1000)
> from 172.23.122.15:50218: Size: 100.1m
> 2012-03-20 07:58:52,783 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 60020, responseTooLarge for: next(-8611621895979000997, 1000)
> from 172.23.122.15:50218: Size: 101.7m
> 2012-03-20 07:59:02,541 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 60020, responseTooLarge for: next(-511305750191148153, 1000)
> from 172.23.122.15:50218: Size: 120.9m
> 2012-03-20 07:59:25,346 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 60020, responseTooLarge for: next(1570572538285935733, 1000)
> from 172.23.122.15:50218: Size: 107.8m
> 2012-03-20 07:59:46,805 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 60020, responseTooLarge for: next(-727080724379055435, 1000)
> from 172.23.122.15:50218: Size: 102.7m
> 2012-03-20 08:00:00,138 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
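
The log lines above show next(..., 1000) calls returning 100 to 214 MB,
which works out to roughly 100 KB or more per row at a caching value of 1000.
The following is a minimal sketch of the arithmetic behind both ideas in the
quoted message: picking a lower caching value from an average row size, and
halving it when a response exceeds a limit. The class, the method names and
the 10 MB budget are assumptions for illustration, and the halving step is
the proposal from the message, not an existing HBase feature. On the client
side, the resulting value would be passed to Scan.setCaching(int).

```java
public class ScannerCachingSketch {

    /** Largest caching value whose expected response fits the budget (never below 1). */
    public static int cachingForBudget(long avgRowBytes, long maxResponseBytes) {
        if (avgRowBytes <= 0 || maxResponseBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long rows = maxResponseBytes / avgRowBytes;
        return (int) Math.max(1L, Math.min(rows, (long) Integer.MAX_VALUE));
    }

    /** Halves the caching value when the last response exceeded the limit. */
    public static int adjust(int currentCaching, long lastResponseBytes, long limitBytes) {
        if (lastResponseBytes > limitBytes) {
            return Math.max(1, currentCaching / 2);
        }
        return currentCaching;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed average row size: a 105 MB response spread over 1000 rows.
        long avgRowBytes = 105L * mb / 1000L;
        // Assumed budget of 10 MB per next() call.
        System.out.println("caching for budget: " + cachingForBudget(avgRowBytes, 10L * mb));
        // The proposed server-side rule: over 100 MB, halve the value.
        System.out.println("after halving: " + adjust(1000, 105L * mb, 100L * mb));
    }
}
```

Sizing from an average is only a rough guide: a few very wide rows can still
push a single response well past the budget, which is where the halving rule
would help.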

Marcos Luis Ortíz Valmaseda (@marcosluis2186)
 Data Engineer at UCI
 http://marcosluis2186.posterous.com
10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION

http://www.uci.cu
http://www.facebook.com/universidad.uci
http://www.flickr.com/photos/universidad_uci