I'll start by stating clearly that I'm not a GC specialist. I spend a
bunch of time with it but forget all the things I learn once I solve my
What exactly is the problem here? Does the server become unresponsive
after 16 hours? What happens in the HBase logs for that regionserver? I
believe you're seeing frequent CMS runs, likely because of fragmentation of
your heap along with your -XX:CMSInitiatingOccupancyFraction of 60. These
would be a precursor to a full GC, which would likely actually take the
A few quick thoughts that you may or may not have run across:
- MSLAB is your friend if you haven't been using it already. See more
- I can't remember exactly, but I believe the number some people used to
quote was 10 seconds per GB for a full GC. So you're looking at a full GC
of roughly 4 minutes (24 GB x 10 s) with that size heap once you do arrive at a full
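- If you're okay having unresponsive regions for 4+ minutes, you'd also
want to increase your ZooKeeper session timeout (zookeeper.session.timeout)
to allow for it; otherwise the regionserver's session will expire during
the pause.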
- If I remember correctly, at a recent presentation yfrog said they were
using heaps as high as 64 GB, but most people thought that was very risky
and that you should run much lower. The 16 GB that Doug quotes is closer to
what people seemed to use.
- I haven't heard of many people setting GC threads explicitly. Since you
set -XX:ParallelGCThreads=6, I assume you have at least 6 true cores?
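
To put rough numbers on the full GC and ZooKeeper timeout points above,
here is a back-of-the-envelope sketch in Python. The 10 s/GB figure is only
the rule of thumb I mentioned, not a measurement, and the function names and
the 1.5x safety margin are made up for illustration. Also keep in mind that
the ZooKeeper servers may cap the session timeout they will grant
(maxSessionTimeout), so raising the HBase-side value alone may not be enough.

```python
def estimate_full_gc_seconds(heap_mb, seconds_per_gb=10.0):
    """Rule-of-thumb full GC pause: heap size in GB times seconds per GB."""
    return heap_mb / 1024.0 * seconds_per_gb


def suggested_zk_timeout_ms(pause_seconds, safety_margin=1.5):
    """Session timeout in ms that would outlast a pause of the given length.

    The safety margin is an arbitrary illustrative choice, not a recommendation.
    """
    return int(pause_seconds * safety_margin * 1000)


if __name__ == "__main__":
    # Heap sizes mentioned in this thread: 24 GB, 16 GB and 12 GB.
    for heap_mb in (24576, 16384, 12288):
        pause = estimate_full_gc_seconds(heap_mb)
        print("%6d MB heap: ~%3.0f s full GC, zookeeper.session.timeout >= %d ms"
              % (heap_mb, pause, suggested_zk_timeout_ms(pause)))
```

By this estimate a 12 GB heap halves both the worst-case pause and the
timeout you would have to tolerate compared with 24 GB.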
We used to run our regionservers at around 24 GB but had constant problems.
Ultimately, we settled on 12 GB heaps with MSLAB enabled (with a 4 MB chunk
size rather than the 2 MB default, due to our cell sizes).
Also, an off-heap block cache is coming in the 0.92 release (
https://issues.apache.org/jira/browse/HBASE-4027). That should
theoretically allow you to use a bunch more memory for the block cache
without the same problems. Others who are more familiar with the feature
would be able to speak better to real world results...
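
To see why getting the block cache off the heap could matter with the
settings quoted below, here is a rough sketch of how much old-generation
room is left before CMS kicks in. It assumes, as a simplification, that a
warmed-up on-heap block cache sits entirely in the old generation and that
CMS starts once old-gen occupancy reaches the configured percentage. The
function name is made up for illustration, and memstores and everything
else on the heap are ignored.

```python
def cms_headroom_mb(heap_mb, new_gen_mb, block_cache_fraction, cms_fraction_pct):
    """MB of old gen left for everything else before CMS is triggered.

    Simplified model: old gen = heap minus new gen; the block cache is sized
    as a fraction of the whole heap and assumed to live in the old gen.
    """
    old_gen_mb = heap_mb - new_gen_mb
    cms_trigger_mb = old_gen_mb * cms_fraction_pct / 100.0
    block_cache_mb = heap_mb * block_cache_fraction
    return cms_trigger_mb - block_cache_mb


if __name__ == "__main__":
    # Values from the quoted settings: 24576 MB heap, 192 MB new gen,
    # hfile.block.cache.size=0.5, CMSInitiatingOccupancyFraction=60.
    headroom = cms_headroom_mb(24576, 192, 0.5, 60)
    print("Old gen headroom before CMS starts: ~%.0f MB" % headroom)
```

Under those assumptions a full block cache leaves only a couple of GB
for memstores and everything else before the CMS threshold is crossed, which
would fit with seeing CMS run very frequently.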
On Tue, Dec 6, 2011 at 5:30 PM, Derek Wollenstein <[EMAIL PROTECTED]> wrote:
> I will take a look at lowering this; unfortunately I'm inheriting existing
> settings and trying to be as conservative as possible when making changes.
> I can definitely try lowering the memory -- I've gotten mixed messages on
> how much to allocate to the HBase heap. I'll start taking a look at moving
> both of these settings down and see how it affects performance (and trying
> to use https://github.com/brianfrankcooper/YCSB/wiki for testing). Thanks
> for the suggestion.
> On Tue, Dec 6, 2011 at 5:20 PM, Doug Meil <[EMAIL PROTECTED]> wrote:
> > There are others who know these GC issues better than I do, but setting
> > hfile.block.cache.size to .5 seems a bit aggressive to me. That's 50% of
> > the heap right there.
> > Also, the issue with setting the max heap to 24 GB is that whenever a full
> > GC is required on a heap that size, it's a killer. Folks at recent HBase
> > hackathons were talking about not going higher than 16 GB for RS.
> > On 12/6/11 8:10 PM, "Derek Wollenstein" <[EMAIL PROTECTED]> wrote:
> > >I've been working on improving GC settings for HBase RegionServer
> > >instances, and I seem to have run into a bit of a dead end.
> > >
> > >Basically I've been trying to tune GC settings and memory settings
> > >appropriately, but I keep on having my server reach something like GC
> > >Death.
> > >
> > >My current memory settings are
> > >HEAPSIZE=24576
> > >-ea -Xmx24576 -Xms24576 -XX:+UseConcMarkSweepGC -XX:+UseCompressedOops
> > >-XX:NewSize=192m -XX:MaxNewSize=192m
> > >-XX:CMSInitiatingOccupancyFraction=60 -verbose:gc -XX:+PrintGCDetails
> > >-XX:+PrintGCDateStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log
> > >-XX:ParallelGCThreads=6
> > >
> > >We've also set hfile.block.cache.size to 0.5 (believing that increasing