Re: Disk space usage of HFilev1 vs HFilev2
Could it be the addition of the memstoreTS? I forget whether that is in v1 as
well.

Matt

On Tue, Aug 28, 2012 at 7:37 AM, Stack <[EMAIL PROTECTED]> wrote:

> On Mon, Aug 27, 2012 at 8:30 PM, anil gupta <[EMAIL PROTECTED]> wrote:
> > Hi All,
> >
> > Here are the steps I followed to load the table in HFilev1 format:
> > 1. Set the property hfile.format.version to 1.
> > 2. Updated the conf across the cluster.
> > 3. Restarted the cluster.
> > 4. Ran the bulk loader.
> >
> > Table has 34 million records and one column family.
> > Results:
> > HDFS space for one replica of table in HFilev2: 39.8 GB
> > HDFS space for one replica of table in HFilev1: 38.4 GB
> >
> > Ironically, as per the above results, HFileV1 is taking 3.5% less space
> > than the HFileV2 format. I also skimmed through the code and saw references
> > to "hfile.format.version" in the HFile.java class.
> >
>
> It would be interesting to know what makes up the 3.5% difference.
> More metadata at the end of the file in v2?
>
> St.Ack
>
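
For anyone checking the numbers quoted above, here is a rough back-of-the-envelope sketch of the arithmetic. It assumes "GB" means 10^9 bytes and that a "record" is one row; the thread does not say how many cells each row holds, so the per-record figure is only an average and does not by itself point to a cause. The function names are made up for illustration.

```python
# Sizes reported in the thread (one replica each).
V2_GB = 39.8           # HFilev2
V1_GB = 38.4           # HFilev1
RECORDS = 34_000_000   # "34 million records"


def relative_saving(larger, smaller):
    """Fraction by which `smaller` undercuts `larger`."""
    return (larger - smaller) / larger


def extra_bytes_per_record(larger_gb, smaller_gb, records, bytes_per_gb=10**9):
    """Average extra bytes per record; bytes_per_gb is an assumption (decimal GB)."""
    return (larger_gb - smaller_gb) * bytes_per_gb / records


saving = relative_saving(V2_GB, V1_GB)
per_record = extra_bytes_per_record(V2_GB, V1_GB, RECORDS)

print(f"v1 is {saving:.1%} smaller than v2")
print(f"v2 uses roughly {per_record:.0f} extra bytes per record on average")
```

Dividing the gap by the record count gives a sense of scale when weighing explanations such as per-KeyValue fields versus per-file metadata.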