

Re: Disk space usage of HFilev1 vs HFilev2
Hi Zahoor,

I mean that one replica of the table took 90 GB of HDFS space in HBase 0.90,
but the same table takes 45 GB of HDFS space in HBase 0.92. So I am
interested in knowing how HFilev2 takes around 50% less HDFS space. No
compression was enabled for these tables, there were no schema changes, and
the same data set is used.

Actually, I have to provide hardware estimates for an HBase cluster, and a
50% difference in disk usage between HFilev1 and HFilev2 makes a big
difference in my estimates. So I am just trying to make sure that less disk
space will be required if we use HFilev2.
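
For the hardware estimate, the arithmetic can be sketched roughly as below.
This is only an illustration: the replication factor of 3 is the HDFS default
and an assumption about this cluster, and the 25% headroom and the function
name raw_hdfs_gb are hypothetical values chosen for the example. Only the
90 GB and 45 GB per-replica figures come from the measurements above.

```python
def raw_hdfs_gb(per_replica_gb, replication=3, headroom=0.25):
    """Estimate raw cluster disk (GB) needed to hold one table.

    per_replica_gb: measured size of a single replica of the table.
    replication:    HDFS replication factor (assumed; 3 is the HDFS default).
    headroom:       fraction of disk kept free (assumed, hypothetical value).
    """
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    stored_gb = per_replica_gb * replication
    # Scale up so that stored_gb fills only (1 - headroom) of the raw disk.
    return stored_gb / (1.0 - headroom)


if __name__ == "__main__":
    measurements = (("HBase 0.90 (HFilev1)", 90), ("HBase 0.92 (HFilev2)", 45))
    for label, size_gb in measurements:
        print(f"{label}: {raw_hdfs_gb(size_gb):.0f} GB raw disk")
```

Under these assumptions the two measurements lead to estimates that differ by
a factor of two, which is why the source of the saving matters before the
smaller number is used for sizing.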

Thanks,
Anil

On Tue, Aug 14, 2012 at 11:50 AM, jmozah <[EMAIL PROTECTED]> wrote:

> Hi
>
> I am not very sure about the storage savings you are talking about, but
> there are definitely savings in RAM, as there are block-level indexes and
> bloom filters instead of file-level ones. More here:
>
> http://www.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
> http://hbase.apache.org/book.html#d540e10937
>
> Was compression enabled in 0.90? Is it enabled now in 0.92?
>
> ./zahoor
>
>
> On 14-Aug-2012, at 11:45 PM, anil gupta <[EMAIL PROTECTED]> wrote:
>
> > Hi All,
> >
> > I recently upgraded my cluster from HBase 0.90 to HBase 0.92. One replica
> > of one table used to take 90 GB in 0.90, but the same table takes 45 GB
> > in 0.92 (HFilev2). The table has 1 column family, and each row stores
> > 300-400 bytes of data (this is the size of the values) in 20-30 columns.
> > I am interested in knowing of any disk usage optimizations done in
> > HFilev2. Please share any relevant document that explains the reduction
> > in disk space usage.
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
>
>
--
Thanks & Regards,
Anil Gupta