Re: Disk space usage of HFilev1 vs HFilev2
Hi Zahoor,

I mean that one replica of the table took 90 GB of HDFS space in HBase 0.90,
but the same table takes 45 GB of HDFS space in HBase 0.92. So I am
interested in knowing how HFilev2 uses around 50% less HDFS space. No
compression was enabled for these tables, the schema did not change, and the
same data set was used.

I have to provide hardware estimates for an HBase cluster, and a 50%
difference in disk usage between HFilev1 and HFilev2 makes a big difference
in those estimates. So I just want to confirm that using HFilev2 really does
require less disk space.
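
To make the effect on the estimate concrete, here is a rough Python sketch of
the arithmetic. It assumes the HDFS default replication factor of 3 and a 25%
headroom for compactions and growth. The headroom figure and the function
name raw_hdfs_gb are illustrative assumptions, not values taken from HBase.
Only the 90 GB and 45 GB single-replica sizes are measured.

```python
def raw_hdfs_gb(single_replica_gb, replication=3, headroom=0.25):
    """Estimate raw HDFS capacity (GB) needed for one table.

    single_replica_gb: size of one replica of the table, as measured on HDFS.
    replication: HDFS replication factor (3 is the HDFS default).
    headroom: extra fraction reserved for compactions and growth
              (0.25 is an assumed placeholder, tune it for your cluster).
    """
    return single_replica_gb * replication * (1 + headroom)


# Single-replica sizes observed for the same table and data set.
hfile_v1_gb = raw_hdfs_gb(90)  # HBase 0.90 (HFilev1)
hfile_v2_gb = raw_hdfs_gb(45)  # HBase 0.92 (HFilev2)

print("HFilev1 estimate: %.2f GB" % hfile_v1_gb)
print("HFilev2 estimate: %.2f GB" % hfile_v2_gb)
print("Difference:       %.2f GB" % (hfile_v1_gb - hfile_v2_gb))
```

Whatever replication factor and headroom are chosen, the estimate scales
linearly with the single-replica size, so the 2x gap carries straight through
to the hardware budget.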


On Tue, Aug 14, 2012 at 11:50 AM, jmozah <[EMAIL PROTECTED]> wrote:

> Hi,
> I am not sure about the storage savings you are talking about, but there
> are definitely savings in RAM, as the index and Bloom filter are now at
> block level instead of file level. More here:
> http://www.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
> http://hbase.apache.org/book.html#d540e10937
> Was compression enabled in 0.90? Is it enabled now in 0.92?
> ./zahoor
> On 14-Aug-2012, at 11:45 PM, anil gupta <[EMAIL PROTECTED]> wrote:
> > Hi All,
> >
> > I recently upgraded my cluster from HBase 0.90 to HBase 0.92. One replica
> > of one table used to take 90 GB in 0.90, but the same table takes 45 GB
> > in 0.92 (HFilev2). The table has 1 column family, and each row stores
> > 300-400 bytes of data (this is the size of the values) in 20-30 columns.
> > I am interested in knowing whether any disk usage optimization was done
> > in HFilev2.
> > Please share any relevant document you know of that would help me
> > understand the reduction in disk space usage.
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
Thanks & Regards,
Anil Gupta