Re: GC recommendations for large Region Server heaps
Hi Suraj,

One thing I have observed is that if you have very high block cache churn, which
happens in a read heavy workload, a full GC eventually happens because more
block cache blocks bleed into the old generation (LRU based caching). I have
seen this happen particularly when the read load is extremely high - more than
10K reads per second - and hence block cache churn is high.
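
If you want to confirm that this is what is happening, one option (just a
sketch; these are standard HotSpot GC logging flags, and the log path is a
placeholder) is to turn on GC logging on the region servers and watch the
tenuring distribution and old generation growth during read peaks:

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
-Xloggc:/var/log/hbase/gc-regionserver.log

If cache blocks are being promoted faster than CMS reclaims them, you will see
the survivor spaces overflow and the old generation fill up between CMS cycles.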

Varun
On Tue, Jul 9, 2013 at 6:08 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:

> Suraj,
> We have heavy read and write loads. With my GC options we cannot avoid
> full GCs, but we can decrease GC time greatly.
>
>
> On Wed, Jul 10, 2013 at 8:05 AM, Suraj Varma <[EMAIL PROTECTED]> wrote:
>
> > Hi Azuryy:
> > Thanks so much for sharing. This gives me a good list of tuning options to
> > read more on while constructing our GC_OPTS.
> >
> > Follow up question: Was your cluster tuned to handle read heavy loads or
> > was it mixed / read-write loads? Just trying to understand what your
> > constraints were.
> > --Suraj
> >
> >
> > On Mon, Jul 8, 2013 at 10:52 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
> >
> > > These are my HBase GC options for CMS; they work well for us:
> > >
> > > -XX:+DisableExplicitGC -XX:+UseCompressedOops -XX:PermSize=160m
> > > -XX:MaxPermSize=160m -XX:GCTimeRatio=19 -XX:SoftRefLRUPolicyMSPerMB=0
> > > -XX:SurvivorRatio=2 -XX:MaxTenuringThreshold=1 -XX:+UseFastAccessorMethods
> > > -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> > > -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSCompactAtFullCollection
> > > -XX:CMSFullGCsBeforeCompaction=0 -XX:+CMSClassUnloadingEnabled
> > > -XX:CMSMaxAbortablePrecleanTime=300 -XX:+CMSScavengeBeforeRemark
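> > >
> > > For reference, a sketch of how such flags are typically wired into
> > > conf/hbase-env.sh via HBASE_REGIONSERVER_OPTS; the heap size, GC log
> > > path, and the subset of flags shown here are placeholders, not
> > > recommendations:
> > >
> > > # conf/hbase-env.sh -- heap size and log path are placeholders
> > > export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
> > >   -Xms8g -Xmx8g \
> > >   -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled \
> > >   -XX:CMSInitiatingOccupancyFraction=70 \
> > >   -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/hbase/gc-regionserver.log"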
> > >
> > >
> > >
> > > On Tue, Jul 9, 2013 at 1:12 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi,
> > > >
> > > > Check http://blog.sematext.com/2013/06/24/g1-cms-java-garbage-collector/
> > > >
> > > > Those graphs show RegionServer before and after switch to G1.  The
> > > > dashboard screenshot further below shows CMS (top row) vs. G1 (bottom
> > > > row).  After those tests we ended up switching to G1 across the whole
> > > > cluster and haven't had issues or major pauses since.... knock on
> > > > keyboard.
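> > > >
> > > > If anyone wants to experiment, the switch itself is only a couple of
> > > > flags, used instead of -XX:+UseParNewGC / -XX:+UseConcMarkSweepGC (a
> > > > sketch; the pause target and log path are placeholders to tune per
> > > > cluster):
> > > >
> > > > -XX:+UseG1GC -XX:MaxGCPauseMillis=100
> > > > -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/hbase/gc-regionserver.log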
> > > >
> > > > Otis
> > > > --
> > > > Solr & ElasticSearch Support -- http://sematext.com/
> > > > Performance Monitoring -- http://sematext.com/spm
> > > >
> > > >
> > > >
> > > > On Mon, Jul 8, 2013 at 2:56 PM, Stack <[EMAIL PROTECTED]> wrote:
> > > > > On Mon, Jul 8, 2013 at 11:09 AM, Suraj Varma <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > >> Hello:
> > > > >> We have an HBase cluster with region servers running on 8GB heap size
> > > > >> with a 0.6 block cache (it is a read heavy cluster, with bursty write
> > > > >> traffic via MR jobs). (version: hbase-0.94.6.1)
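> > > > >>
> > > > >> (For reference, that 0.6 fraction is hfile.block.cache.size in
> > > > >> hbase-site.xml, i.e. roughly:
> > > > >>
> > > > >>   <property>
> > > > >>     <name>hfile.block.cache.size</name>
> > > > >>     <value>0.6</value>
> > > > >>   </property>
> > > > >>
> > > > >> so about 60% of the RS heap is given to the block cache.)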
> > > > >>
> > > > >> During HBaseCon, while speaking to a few attendees, I heard some folks
> > > > >> were running region servers as high as 24GB and some others in the
> > > > >> 16GB range.
> > > > >>
> > > > >> So - question: Are there any special GC recommendations (tuning
> > > > >> parameters, flags, etc) that folks who run at these large heaps can
> > > > >> recommend while moving up from an 8GB heap? i.e. for 16GB and for
> > > > >> 24GB RS heaps ... ?
> > > > >>
> > > > >> I'm especially concerned about long pauses causing zk session timeouts
> > > > >> and consequent RS shutdowns. Our boxes do have a lot of RAM and we are
> > > > >> exploring how we can use more of it for the cluster while maintaining
> > > > >> overall stability.
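> > > > >>
> > > > >> (The timeout I mean is zookeeper.session.timeout in hbase-site.xml;
> > > > >> the value below is just a placeholder, not a recommendation:
> > > > >>
> > > > >>   <property>
> > > > >>     <name>zookeeper.session.timeout</name>
> > > > >>     <value>90000</value>
> > > > >>   </property>
> > > > >>
> > > > >> Any GC pause longer than this costs the RS its ZK session and it aborts.)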
> > > > >>
> > > > >> Also - if there are clusters running multiple region servers per host,
> > > > >> I'd be very interested to know what RS heap sizes those are being run
> > > > >> at ... and whether this was chosen as an alternative to running a
> > > > >> single RS with large heap.
> > > > >>
> > > > >> (I know I'll have to test the GC stuff out on my cluster and for