Re: how blocks are replicated
Konstantin Boudnik 2009-11-16, 21:51
1. Your first guess is right: the file is 'broken' into blocks, and each
block is then stored according to the replication policy, among other things.

2. It doesn't happen automatically, as far as I know. One has to
're-balance' the cluster in this case.

--
Take care,
Cos

On 11/16/09 13:47, Massoud Mazar wrote:
> This is probably a basic question:
>
> Assuming replication is set to 3, when we store a large file in HDFS, is
> the whole file stored in 3 nodes (even if you have many more nodes) or
> it is broken into blocks and each block is written to 3 nodes? (I assume
> it is the latter, so when you have 30 nodes available, each one gets a
> piece of the file, providing more performance when reading the file).
>
> My second question is what happens if we add more nodes to an existing
> cluster? Would any existing blocks be moved to these new nodes to expand
> the distribution of the data to new nodes?
>
> Thanks
>
> Massoud
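
To put rough numbers on point 1, here is a minimal Python sketch of the idea, not of HDFS itself. It assumes a 64 MB block size (the real value is configurable) and picks replica nodes uniformly at random, which is only a toy stand-in for the actual placement policy. The names split_into_blocks and place_blocks are made up for this illustration.

```python
import random

BLOCK_SIZE = 64 * 1024 * 1024  # assumed block size of 64 MB; configurable in a real cluster


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block for a file of file_size bytes."""
    if file_size <= 0:
        return []
    full, rest = divmod(file_size, block_size)
    # All blocks are full-sized except possibly the last one.
    return [block_size] * full + ([rest] if rest else [])


def place_blocks(num_blocks, nodes, replication=3, seed=0):
    """Pick `replication` distinct nodes per block, uniformly at random.

    Toy policy for illustration only; the real placement policy is more involved.
    """
    rng = random.Random(seed)
    return [rng.sample(nodes, replication) for _ in range(num_blocks)]


if __name__ == "__main__":
    nodes = [f"node{i:02d}" for i in range(30)]
    blocks = split_into_blocks(1024 ** 3)  # a 1 GB file
    placement = place_blocks(len(blocks), nodes)
    used = {node for replicas in placement for node in replicas}
    print(len(blocks), "blocks,", len(blocks) * 3, "replicas,", len(used), "nodes used")
```

With these assumptions a 1 GB file becomes 16 blocks and 48 replicas, so the file is spread over many of the 30 nodes rather than sitting whole on 3 of them.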