Re: long garbage collecting pause

On 02/10/2012 11:32, Greg Ross wrote:
> Thanks for the suggestions.
>
> I was attempting to tune the GC via mapred.child.java.opts in the job's
> Oozie config instead of in hbase-env.sh. I think this is why my efforts
> were to no avail. It was likely having no effect on the read/write
> performance. Is there any way of specifying job-specific HBase parameters
> instead of globally setting them in hbase-env.sh?
>
> The cluster has 175 nodes, each with 48GB of RAM. The overall data input
> size is 7TB. I initially pre-split the table into 30 regions, then into 100
> in another attempt. Each job runs on a 700GB chunk of the data. I used
> RegionSplitter to create and condition the table, and therefore there's
> currently no compression. I'm thinking of recreating the table and altering
> it to use LZO compression before attempting the jobs again.
There are many things you can do to tune HBase performance.
Chapter 11 of Lars George's book "HBase: The Definitive Guide" is
dedicated to this tricky topic, and the HBase book has good advice too:

http://hbase.apache.org/book.html#perf.reading

Thanks to Doug for the link.
>
> Cheers.
>
> Greg
>
>
>
> On Tue, Oct 2, 2012 at 7:20 AM, Damien Hardy <[EMAIL PROTECTED]> wrote:
>
>> Hello
>>
>> 2012/10/2 Marcos Ortiz <[EMAIL PROTECTED]>
>>
>>> Another thing I'm seeing is that one of your main processes is
>>> compaction, so you can optimize this by increasing the size of your
>>> regions (by default the size of a region is 256 MB), but you may then
>>> have a "split/compaction storm" on your hands, as Lars calls them in
>>> his book.
>>
>> Actually, it seems the default value for hbase.hregion.max.filesize was
>> increased to 1 GB in 0.92.
>> http://hbase.apache.org/book/upgrade0.92.html#d2051e266
>>
>> But you can set it higher (the maximum is 20 GB) and split manually.
>> http://hbase.apache.org/book/important_configurations.html#bigger.regions
>>
>> Cheers,
>>
>> --
>> Dam
>>
>
>
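
To put the region sizes mentioned in this thread in perspective, here is a rough back-of-the-envelope sketch in Python. It only divides the 7TB input size by each maximum region size (256 MB, 1 GB, 20 GB) and spreads the result over the 175 nodes. The function name estimated_regions is hypothetical, and the sketch assumes the stored table is about the same size as the input, uses binary units, and ignores compression, key/value overhead and uneven key distribution, so treat the numbers as orders of magnitude only.

```python
# Rough estimate of how many regions a table needs at a given
# hbase.hregion.max.filesize. Assumption: stored size == input size.

MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

NODES = 175  # cluster size mentioned in the thread


def estimated_regions(total_bytes, max_region_bytes):
    """Smallest number of regions that can hold total_bytes (ceiling division)."""
    return -(-total_bytes // max_region_bytes)


if __name__ == "__main__":
    total = 7 * TB  # overall input size mentioned in the thread
    for label, size in [("256 MB", 256 * MB), ("1 GB", 1 * GB), ("20 GB", 20 * GB)]:
        regions = estimated_regions(total, size)
        print("max region size %s: about %d regions, %.1f per node"
              % (label, regions, regions / NODES))
```

Larger regions mean far fewer regions per node, which is the motivation for raising the maximum size and splitting manually.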

--
Marcos Ortiz Valmaseda,
Data Engineer && Senior System Administrator at UCI
Blog: http://marcosluis2186.posterous.com
Linkedin: http://www.linkedin.com/in/marcosluis2186
Twitter: @marcosluis2186