

Re: long garbage collecting pause

On 02/10/2012 11:32, Greg Ross wrote:
> Thanks for the suggestions.
>
> I was attempting to tune the GC via mapred.child.java.opts in the job's
> Oozie config instead of in hbase-env.sh. I think this is why my efforts
> were to no avail. It was likely having no effect on the read/write
> performance. Is there any way of specifying job-specific HBase parameters
> instead of globally setting them in hbase-env.sh?
>
> The cluster has 175 nodes, each with 48GB of RAM. The overall data input
> size is 7TB and I pre-split the table into 30 regions initially, then 100
> in another attempt. Each job runs on a 700GB chunk of the data. I used
> RegionSplitter to create and condition the table and therefore there's
> currently no compression. I'm thinking of recreating the table and altering
> it to use LZO compression before attempting the jobs again.
There are many things you can do for HBase performance tuning.
In Lars George's book "HBase: The Definitive Guide", Chapter 11 is
dedicated to this tricky topic, and the HBase book has good points too:

http://hbase.apache.org/book.html#perf.reading

Thanks to Doug for the link.
>
> Cheers.
>
> Greg
>
>
>
> On Tue, Oct 2, 2012 at 7:20 AM, Damien Hardy <[EMAIL PROTECTED]> wrote:
>
>> Hello
>>
>> 2012/10/2 Marcos Ortiz <[EMAIL PROTECTED]>
>>
>>> Another thing I'm seeing is that one of your main processes is
>>> compaction, so you can optimize all this by increasing the size of
>>> your regions (by default the size of a region is 256 MB), but you
>>> will have on your hands a "split/compaction storm", as Lars calls
>>> them in his book.
>>
>> Actually it seems the default value for hbase.hregion.max.filesize was
>> increased to 1 GB in 0.92.
>> http://hbase.apache.org/book/upgrade0.92.html#d2051e266
>>
>> But you can set it higher (max is 20 GB) and split manually.
>> http://hbase.apache.org/book/important_configurations.html#bigger.regions
>>
>> Cheers,
>>
>> --
>> Dam
>>
>
>
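
To put rough numbers on the region sizes quoted above, here is a
back-of-the-envelope sketch in Python. It only divides the 7TB input
by each hbase.hregion.max.filesize value mentioned in this thread and
spreads the result over the 175 nodes. It assumes the data stored in
HBase is about the same size as the input (no compression, no
key/value overhead), which will not hold exactly, and the helper name
regions_needed is made up for this example.

```python
import math

MB_PER_GB = 1024
GB_PER_TB = 1024


def regions_needed(data_gb, max_filesize_gb):
    """Rough region count if every region holds at most max_filesize_gb.

    Assumption: stored size equals input size, which ignores
    compression and HBase key/value overhead.
    """
    return math.ceil(data_gb / max_filesize_gb)


data_gb = 7 * GB_PER_TB   # 7TB of input, from the thread
nodes = 175               # cluster size, from the thread

# Region sizes mentioned in the thread: 256 MB, 1 GB and 20 GB.
for label, size_gb in [("256 MB", 256 / MB_PER_GB), ("1 GB", 1), ("20 GB", 20)]:
    n = regions_needed(data_gb, size_gb)
    print(f"{label:>6}: ~{n} regions, ~{n / nodes:.1f} per node")
```

The point is only the order of magnitude: small regions mean far more
regions per node, and every region that fills up is another split
(and more compaction work) during the load.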

--
Marcos Ortiz Valmaseda,
Data Engineer && Senior System Administrator at UCI
Blog: http://marcosluis2186.posterous.com
Linkedin: http://www.linkedin.com/in/marcosluis2186
Twitter: @marcosluis2186