Hadoop >> mail # user >> block size


HDFS blocks are stored as files in the underlying filesystem of your
datanodes, and those files are not padded out to the configured block
size. So if you store a 10 MB file and your block size is 128 MB, you
still use only 10 MB on disk (times 3 with the default replication
factor).
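The arithmetic above can be sketched as a small self-contained example (the class and method names are illustrative, not part of any Hadoop API):

```java
// Sketch: HDFS blocks are variable-length up to the configured block
// size, so a file's on-disk footprint is its actual size times the
// replication factor, not a multiple of the block size.
public class HdfsBlockUsage {
    static final long MB = 1024L * 1024L;

    // Bytes actually stored on disk for one file, across all replicas.
    // Blocks are NOT padded to blockSize, so blockSize does not appear.
    static long bytesOnDisk(long fileSize, int replication) {
        return fileSize * replication;
    }

    // Number of HDFS blocks the namenode must track for one file
    // (at least one block, even for a tiny file).
    static long blockCount(long fileSize, long blockSize) {
        return Math.max(1, (fileSize + blockSize - 1) / blockSize);
    }

    public static void main(String[] args) {
        // 10 MB file, 128 MB blocks, default replication factor of 3
        System.out.println(bytesOnDisk(10 * MB, 3) / MB);       // 30 MB, not 384 MB
        System.out.println(blockCount(10 * MB, 128 * MB));      // 1 block
        System.out.println(blockCount(300 * MB, 128 * MB));     // 3 blocks
    }
}
```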

However, the namenode keeps metadata for every file and block in
memory, so it does incur additional overhead when it has to track a
large number of small files. If you can merge small files into larger
ones, it's best practice to do so.
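To see why this matters at scale, here is a rough back-of-the-envelope sketch. The ~150 bytes of namenode heap per file or block object is an oft-cited estimate, not an exact figure, and the class is purely illustrative:

```java
// Rough estimate of namenode heap consumed by file metadata.
// ASSUMPTION: ~150 bytes of heap per file object and per block object,
// a commonly quoted ballpark, not a guaranteed number.
public class NamenodeOverhead {
    static final long OBJ_BYTES = 150; // assumed heap cost per metadata object

    // Approximate namenode heap for `files` files of `blocksPerFile` blocks each.
    static long heapBytes(long files, long blocksPerFile) {
        return files * OBJ_BYTES                   // one inode object per file
             + files * blocksPerFile * OBJ_BYTES;  // one object per block
    }

    public static void main(String[] args) {
        // ~10 TB of data as 10 million 1 MB files (1 block each)...
        System.out.println(heapBytes(10_000_000, 1)); // ~3 GB of namenode heap
        // ...versus the same data merged into ~80,000 128 MB files.
        System.out.println(heapBytes(80_000, 1));     // ~24 MB of namenode heap
    }
}
```

Same data, roughly two orders of magnitude less namenode memory once the small files are merged.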

-Joey

On Tue, Sep 20, 2011 at 9:54 PM, hao.wang <[EMAIL PROTECTED]> wrote:
> Hi All:
> I have lots of small files stored in HDFS, and my HDFS block size is 128 MB. Each file is significantly smaller than the block size. Does each small file still consume 128 MB in HDFS?
>
> regards
> 2011-09-21
>
>
>
> hao.wang
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434