Re: Is there an additional overhead when storing data in HDFS?
The namenode will also store a trivial amount of metadata in its journal/fsimage.
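
To make the figures quoted below concrete, here is a minimal back-of-the-envelope sketch in plain Java (not the Hadoop API). It assumes the defaults discussed in this thread: 4 bytes of CRC checksum per 512 bytes of data (the io.bytes.per.checksum setting) and a replication factor of 2 (dfs.replication); the numbers are this example's, not read from a live cluster:

public class HdfsOverheadEstimate {
    public static void main(String[] args) {
        long fileBytes = 2L * 1024 * 1024 * 1024; // 2GB file from the example
        int replication = 2;                      // dfs.replication in this example
        long bytesPerChecksum = 512;              // io.bytes.per.checksum default
        long checksumSize = 4;                    // a CRC32 checksum is 4 bytes

        // Block data stored across the datanodes.
        long dataBytes = fileBytes * replication;
        // Checksum data stored alongside the blocks (in the .meta files).
        long checksumBytes = dataBytes / bytesPerChecksum * checksumSize;

        System.out.printf("block data:    %d MB%n", dataBytes / (1024 * 1024));
        System.out.printf("checksum data: %d MB%n", checksumBytes / (1024 * 1024));
        // Prints: block data: 4096 MB, checksum data: 32 MB
    }
}

So the 2GB file consumes roughly 4GB of block data plus about 32MB of checksum (.meta) files on the datanodes, in addition to the trivial namenode metadata mentioned above.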

On Tue, Nov 20, 2012 at 11:21 PM, WangRamon <[EMAIL PROTECTED]> wrote:

> Thanks. Besides the checksum data, is there anything else? Data in the
> namenode?
>
> ------------------------------
> Date: Tue, 20 Nov 2012 23:14:06 -0800
> Subject: Re: Is there an additional overhead when storing data in HDFS?
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
>
>
> HDFS uses 4GB for the file + checksum data.
>
> By default, for every 512 bytes of data, 4 bytes of checksum are stored. In
> this case that adds an additional 32MB (4GB / 512 x 4 bytes).
>
> On Tue, Nov 20, 2012 at 11:00 PM, WangRamon <[EMAIL PROTECTED]> wrote:
>
> Hi All
>
> I'm wondering if there is an additional overhead when storing data in
> HDFS. For example, I have a 2GB file and the replication factor of HDFS is
> 2; when the file is uploaded, will HDFS use 4GB to store it, or more than
> 4GB? If it takes more than 4GB of space, why?
>
> Thanks
> Ramon
>
>
>
>
> --
> http://hortonworks.com/download/
>
>
--
http://hortonworks.com/download/