Re: how blocks are replicated
1. Your first guess is right: the file is split into blocks, and each block
is then stored according to the replication policy (among other things).

2. As far as I know, it doesn't happen automatically: existing blocks stay
where they are. One has to rebalance the cluster in this case (for example,
by running the HDFS balancer).
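
To make both points a bit more concrete, here is a rough Python sketch. It
is only a toy model, not how HDFS is implemented: the 64 MB block size is an
assumed value, the node names are made up, and place_blocks() is a
hypothetical round-robin stand-in for the real placement policy, which also
takes things like rack topology into account.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size; configurable in a real cluster
REPLICATION = 3                # replication factor from the question


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block of a file of `file_size` bytes."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_blocks(num_blocks, nodes, replication=REPLICATION):
    """Toy round-robin placement (hypothetical, not the real HDFS policy).

    Each block is assigned to `replication` consecutive nodes, so the
    replicas of one block always land on distinct nodes.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        i: [nodes[(i * replication + r) % len(nodes)] for r in range(replication)]
        for i in range(num_blocks)
    }


# Point 1: a 1 GiB file on a 30-node cluster is stored block by block.
nodes = [f"node{n:02d}" for n in range(30)]
blocks = split_into_blocks(1024 * 1024 * 1024)
placement = place_blocks(len(blocks), nodes)

# Point 2: adding nodes later does not touch blocks that are already written.
nodes += ["node30", "node31"]
new_nodes_used = {n for replicas in placement.values() for n in replicas} & {
    "node30",
    "node31",
}
```

In this model the new nodes hold nothing until something explicitly moves
replicas onto them, which is the job a rebalancing step does.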

Take care,

On 11/16/09 13:47, Massoud Mazar wrote:
> This is probably a basic question:
> Assuming replication is set to 3, when we store a large file in HDFS, is
> the whole file stored on 3 nodes (even if you have many more nodes), or
> is it broken into blocks, with each block written to 3 nodes? (I assume
> it is the latter, so when you have 30 nodes available, each one gets a
> piece of the file, providing more performance when reading the file).
> My second question is what happens if we add more nodes to an existing
> cluster? Would any existing blocks be moved to these new nodes to expand
> the distribution of the data to new nodes?
> Thanks
> Massoud