Hadoop >> mail # user >> block size

HDFS blocks are stored as files in the underlying filesystem of your
datanodes. Those files do not take a fixed amount of space, so if you
store 10 MB in a file and you have 128 MB blocks, you still only use
10 MB (times 3 with default replication).
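The arithmetic above can be sketched as a quick back-of-the-envelope calculation (the block size, replication factor, and file size here are just the example numbers from this thread, not values read from a live cluster):

```python
# Sketch: on-disk usage of a small file in HDFS.
BLOCK_SIZE_MB = 128   # dfs.blocksize (example from this thread)
REPLICATION = 3       # dfs.replication default
FILE_SIZE_MB = 10     # the actual file size

# An HDFS block file is only as large as the data it holds, so the
# file occupies file_size * replication on disk, not a full block
# per replica.
disk_usage_mb = FILE_SIZE_MB * REPLICATION
print(disk_usage_mb)  # 30, not BLOCK_SIZE_MB * REPLICATION = 384
```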

However, the namenode does incur additional overhead, because it keeps
the metadata for every file and block in memory, and that footprint
grows with the number of files rather than their size. So, if you can
merge small files, it's best practice to do so.
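To get a feel for why merging matters, here is a rough estimate using the commonly cited rule of thumb of roughly 150 bytes of namenode heap per namespace object (file or block). The constant and the file counts are illustrative assumptions, not measurements:

```python
# Rough namenode heap estimate. ~150 bytes per namespace object
# (file entry or block) is an approximate rule of thumb; the real
# figure varies by Hadoop version.
BYTES_PER_OBJECT = 150

def namenode_heap_mb(num_files, blocks_per_file=1):
    """Approximate namenode heap (MB) to track num_files files."""
    objects = num_files * (1 + blocks_per_file)  # file entry + its blocks
    return objects * BYTES_PER_OBJECT / (1024 * 1024)

# 10 million small files (one block each) vs. the same data merged
# into 100,000 larger files:
print(round(namenode_heap_mb(10_000_000)))  # ~2861 MB of heap
print(round(namenode_heap_mb(100_000)))     # ~29 MB of heap
```

The point is that merging reduces namenode memory pressure by two orders of magnitude here, even though the data volume on the datanodes is unchanged.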


On Tue, Sep 20, 2011 at 9:54 PM, hao.wang <[EMAIL PROTECTED]> wrote:
> Hi All:
>   I have lots of small files stored in HDFS. My HDFS block size is 128M. Each file is significantly smaller than the HDFS block size. So, I want to know whether each small file uses 128M in HDFS?
> regards
> 2011-09-21
> hao.wang

Joseph Echeverria
Cloudera, Inc.