Re: repetita iuvant?
On 10/25/2012 07:44 AM, Anoop Sam John wrote:
> Hi
> Can you tell more details? How much data your scan is going to retrieve?
It's a full scan of 1.7TB of data on 62 regionserver+master and ZK
quorum machines. I hoped that block caching might slightly
improve read performance. HBase version 0.92.1, scanning with Hadoop
1.0.3 through TableInputFormat.
>   What is the time taken in each attempt ?
about 1h20'
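
As a rough sanity check on that figure, the scan throughput can be estimated from the numbers above. This is only a sketch: it assumes binary units (1 TB = 1024 * 1024 MB) and that all 62 machines serve regions, which may overstate the region server count slightly.

```python
# Back-of-the-envelope scan throughput from the figures in this thread.
# Assumptions (not measured): 1 TB = 1024 * 1024 MB, and all 62 machines
# act as region servers.
DATA_TB = 1.7
SCAN_MINUTES = 80        # "about 1h20'"
REGION_SERVERS = 62

data_mb = DATA_TB * 1024 * 1024
scan_seconds = SCAN_MINUTES * 60

aggregate_mb_per_s = data_mb / scan_seconds
per_server_mb_per_s = aggregate_mb_per_s / REGION_SERVERS

print(f"aggregate: {aggregate_mb_per_s:.0f} MB/s")
print(f"per region server: {per_server_mb_per_s:.1f} MB/s")
```
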
> Can you observe the cache hit ratio?
while the blockCacheSizeMB=1649.8
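
To put that cache size in perspective, here is a rough estimate of how much of the table the block cache could hold at once. It is a sketch under assumptions: that blockCacheSizeMB is a per-region-server figure, that it is similar on all 62 machines, and that 1 TB = 1024 * 1024 MB.

```python
# Fraction of the table that fits in the combined block cache.
# Assumptions (not confirmed): blockCacheSizeMB is per region server and
# roughly the same on each of 62 servers; 1 TB = 1024 * 1024 MB.
BLOCK_CACHE_MB_PER_SERVER = 1649.8
REGION_SERVERS = 62
DATA_TB = 1.7

total_cache_mb = BLOCK_CACHE_MB_PER_SERVER * REGION_SERVERS
data_mb = DATA_TB * 1024 * 1024
cache_fraction = total_cache_mb / data_mb

print(f"combined block cache: {total_cache_mb / 1024:.1f} GB")
print(f"share of table that fits in cache: {cache_fraction:.1%}")
```

If these assumptions hold, only a small share of the data fits in cache at any time, so a repeated full scan would probably evict most blocks before reading them again.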

>  What is the memory avail in RS?
( in hbase-env.sh: export HBASE_REGIONSERVER_OPTS="-Xmx8g -Xms8g
-Xmn128m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=70" )
> .....Also the cluster details and regions
1525 regions
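
A quick estimate of the average region size from these numbers, compared with the configured split threshold. This is a sketch assuming 1 TB = 1024 * 1024 MB and that the 1.7TB is spread evenly over the regions.

```python
# Average region size versus the 256MB hbase.hregion.max.filesize setting.
# Assumptions: 1 TB = 1024 * 1024 MB, data evenly spread across regions.
DATA_TB = 1.7
REGIONS = 1525
MAX_FILESIZE_MB = 256    # the configured hbase.hregion.max.filesize

data_mb = DATA_TB * 1024 * 1024
avg_region_mb = data_mb / REGIONS
ratio_to_threshold = avg_region_mb / MAX_FILESIZE_MB

print(f"average region size: {avg_region_mb:.0f} MB")
print(f"multiple of split threshold: {ratio_to_threshold:.1f}x")
```
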
Are the regions too big? I created a pre-split table before bulk importing. I
don't understand why the number of regions didn't increase afterwards.
hbase.hregion.max.filesize is the default 256MB and the regions are
roughly 1GB. How come HBase has not split them? But that's another