HBase >> mail # user >> Disk space usage of HFilev1 vs HFilev2


Re: Disk space usage of HFilev1 vs HFilev2
Could it be the addition of the memstoreTS? I forget whether that is in v1
as well.

Matt

On Tue, Aug 28, 2012 at 7:37 AM, Stack <[EMAIL PROTECTED]> wrote:

> On Mon, Aug 27, 2012 at 8:30 PM, anil gupta <[EMAIL PROTECTED]> wrote:
> > Hi All,
> >
> > Here are the steps I followed to load the table in HFilev1 format:
> > 1. Set the property hfile.format.version to 1.
> > 2. Updated the conf across the cluster.
> > 3. Restarted the cluster.
> > 4. Ran the bulk loader.
> >
> > Table has 34 million records and one column family.
> > Results:
> > HDFS space for one replica of table in HFilev2: 39.8 GB
> > HDFS space for one replica of table in HFilev1: 38.4 GB
> >
> > Ironically, as per the above results, HFilev1 takes 3.5% less space
> > than HFilev2. I also skimmed through the code and saw references
> > to "hfile.format.version" in the HFile.java class.
> >
>
> It would be interesting to know what makes up the 3.5% difference.
> More metadata at the end of the file in v2?
>
> St.Ack
>
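
To put the numbers in this thread in perspective, the short Python sketch below recomputes the 3.5% figure and converts the gap into extra bytes per record. It only uses the figures quoted above (39.8 GB, 38.4 GB, 34 million records). Treating "GB" as 2^30 bytes is an assumption, and the function names are made up for illustration. The thread does not say how many cells each record holds, so the result is a per-record average, not a per-cell one.

```python
# Figures reported in the thread (one replica of the table in each format).
V2_GB = 39.8
V1_GB = 38.4
RECORDS = 34_000_000

# Assumption: "GB" in the report means 2**30 bytes. Pass 10**9 instead if the
# sizes were reported in decimal gigabytes.
BYTES_PER_GB = 1024 ** 3


def relative_saving(v1_gb, v2_gb):
    """Fraction by which the v1 table is smaller than the v2 table."""
    return (v2_gb - v1_gb) / v2_gb


def extra_bytes_per_record(v1_gb, v2_gb, records, bytes_per_gb=BYTES_PER_GB):
    """Average number of additional bytes v2 spends on each record."""
    return (v2_gb - v1_gb) * bytes_per_gb / records


saving_pct = 100 * relative_saving(V1_GB, V2_GB)
per_record = extra_bytes_per_record(V1_GB, V2_GB, RECORDS)

print(f"v1 is {saving_pct:.1f}% smaller than v2")
print(f"v2 uses roughly {per_record:.0f} extra bytes per record")
```

Under these assumptions the gap averages a few tens of bytes per record. If a record holds several cells, the overhead per cell is correspondingly smaller, so this number can be compared against either guess made in the thread: a small extra field per cell, or extra metadata at the end of each file.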