Re: Performance test results
J-D,
I'll try what you suggest, but it's worth pointing out that my data set has
over 300M rows; in my read test, however, I am randomly reading from a subset
that contains only 0.5M rows (5,000 rows in each of the 100 key ranges in the
table).

-eran
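
For illustration, a minimal sketch of a random-read test along these lines, assuming an HBase 0.90-era Java client; the table name "mytable", the family "cf", and the "<range>-<row>" key format are assumptions, not details from this thread:

    import java.util.Random;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RandomReadTest {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");    // hypothetical table name
        Random rnd = new Random();
        for (int i = 0; i < 100000; i++) {
          int range = rnd.nextInt(100);                // 100 key ranges
          int row = rnd.nextInt(5000);                 // 5,000 rows per range
          Get get = new Get(Bytes.toBytes(range + "-" + row));  // assumed key format
          get.addFamily(Bytes.toBytes("cf"));          // hypothetical family
          Result result = table.get(get);
          // time the call / verify the result here
        }
        table.close();
      }
    }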

On Tue, May 3, 2011 at 23:29, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> On Tue, May 3, 2011 at 6:20 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> > Flushing, at least when I try it now, long after I stopped writing,
> > doesn't seem to have any effect.
>
> Bummer.
>
> >
> > In my log I see this:
> > 2011-05-03 08:57:55,384 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=3.39 GB, free=897.87 MB, max=4.27 GB, blocks=54637, accesses=89411811, hits=75769916, hitRatio=84.74%%, cachingAccesses=83656318, cachingHits=75714473, cachingHitsRatio=90.50%%, evictions=1135, evicted=7887205, evictedPerRun=6949.0791015625
> >
> > and every 30 seconds or so something like this:
> > 2011-05-03 08:58:07,900 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 436.92 MB of total=3.63 GB
> > 2011-05-03 08:58:07,947 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=436.95 MB, total=3.2 GB, single=931.65 MB, multi=2.68 GB, memory=3.69 KB
> >
> > Now, if the entire working set I'm reading is 100MB in size, why would it
> > have to evict 436MB just to get it filled back in 30 seconds?
>
> I was about to ask the same question... From what I can tell from this
> log, it seems that your working dataset is much larger than 3GB (the
> fact that it's evicting means it could be a lot more) and that's only
> on that region server.
>
> The first reason that comes to mind for why it would be so much bigger
> is that you uploaded your dataset more than once, and since HBase keeps
> versions of the data, it could accumulate. That doesn't explain how it
> would grow into GBs, since by default a family only keeps 3 versions...
> unless you set that higher than the default, or you uploaded the same
> data tens of times within 24 hours and the major compactions didn't
> kick in.
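
One quick way to rule out accumulated versions is to check how many versions each family is configured to keep; a minimal sketch using the 0.90-era admin API, assuming the table is named "mytable" (the real name isn't in this thread):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CheckVersions {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);
        HTableDescriptor desc = admin.getTableDescriptor(Bytes.toBytes("mytable"));
        for (HColumnDescriptor family : desc.getFamilies()) {
          System.out.println(family.getNameAsString()
              + " keeps up to " + family.getMaxVersions() + " versions");
        }
      }
    }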
>
> In any case, it would be interesting if you:
>
>  - truncate the table
>  - re-import the data
>  - force a flush
>  - wait a bit until the flushes are done (should take 2-3 seconds if
> your dataset is really 100MB)
>  - do a "hadoop dfs -dus" on the table's directory (should be under /hbase)
>  - if the number is way out of whack, review how you are inserting
> your data. Either way, please report back.
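
A rough Java equivalent of the flush-and-measure steps above (truncation and re-import would be done separately, e.g. from the HBase shell); the table name "mytable" and the default /hbase root directory are assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class FlushAndMeasure {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        new HBaseAdmin(conf).flush("mytable");   // hypothetical table name
        Thread.sleep(10000);                     // give the async flush time to finish
        FileSystem fs = FileSystem.get(conf);
        long bytes = fs.getContentSummary(new Path("/hbase/mytable")).getLength();
        System.out.println("Table uses " + bytes + " bytes on HDFS");
      }
    }

If that number is far beyond the ~100MB the working set should occupy, the table itself is much bigger than expected.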
>
> >
> > Also, what is a good value for hfile.block.cache.size? I have it at .35
> > now, but with 12.5GB of RAM available for the region servers it seems I
> > should be able to get it much higher.
>
> Depends; you also have to account for the MemStores, which by default
> can use up to 40% of the heap
> (hbase.regionserver.global.memstore.upperLimit), currently leaving you
> only 100-40-35=25% of the heap for stuff like serving requests,
> compacting, flushing, etc. It's hard to give a good number for what
> should be left to the rest of HBase though...
>
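
As a rough illustration with the numbers in this thread, treating the 12.5GB as the region server heap, with the block cache at 0.35 and the memstore upper limit at the default 0.40:

    block cache:      0.35 x 12.5 GB ≈ 4.4 GB
    memstores:        0.40 x 12.5 GB ≈ 5.0 GB
    everything else:  0.25 x 12.5 GB ≈ 3.1 GB

so raising hfile.block.cache.size further comes directly out of the roughly 3.1GB left for serving requests, compactions and flushes.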