Re: what happens when a datanode rejoins?
Harsh J 2012-09-11, 08:03
George has answered most of these. I'll just add on:
On Tue, Sep 11, 2012 at 12:44 PM, Mehul Choube
<[EMAIL PROTECTED]> wrote:
> 1. Some of the blocks it was managing are deleted/modified?
A DN runs a block report on startup and sends its list of blocks to
the NN. The NN validates the report, and if it finds that any files
are missing block replicas, it schedules re-replication from one of
the good DNs that still carry them. Blocks modified outside of HDFS
fail their stored checksums, so they are treated as corrupt and
deleted, and are re-replicated in the same manner.
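
To make the flow above concrete, here is a rough toy model in Python. It is
only a sketch and not how the NameNode is actually implemented: the function
name process_block_report, the dict-based block map, and the DN/block ids are
all made up for illustration.

```python
# Toy model of block-report handling. NOT actual HDFS code.
# All names (process_block_report, block_map, dn1...) are hypothetical.

def process_block_report(block_map, target_replication, dn_id, reported_blocks):
    """Update the NN's view of which blocks dn_id holds, then return a
    mapping of under-replicated block -> a good DN to copy it from."""
    reported = set(reported_blocks)
    for block_id, holders in block_map.items():
        if block_id in reported:
            holders.add(dn_id)        # DN still carries this replica
        else:
            holders.discard(dn_id)    # replica is gone from this DN

    to_replicate = {}
    for block_id, holders in block_map.items():
        # Only blocks with at least one surviving replica can be re-replicated.
        if 0 < len(holders) < target_replication[block_id]:
            to_replicate[block_id] = sorted(holders)[0]
    return to_replicate


block_map = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn1", "dn2", "dn3"},
}
target = {"blk_1": 3, "blk_2": 3}

# dn3 rejoins and reports only blk_1; its copy of blk_2 was deleted on disk.
plan = process_block_report(block_map, target, "dn3", ["blk_1"])
print(plan)  # {'blk_2': 'dn1'}
```

In this toy run, blk_2 drops to two replicas, so it gets scheduled for a copy
from one of the DNs that still has it.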
> 2. The size of the blocks are now modified say from 64MB to 128MB?
George's got this already. Changing the block size does not affect
any existing blocks. Block size is a per-file metadata property, fixed
when the file is created.
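
As a quick illustration of the per-file point, the arithmetic below is a
sketch with made-up file sizes (200MB each); the dicts stand in for the
per-file metadata and are not an HDFS API.

```python
# Illustrative arithmetic only; the file records below are hypothetical.
MB = 1024 * 1024

def block_count(file_size_bytes, block_size_bytes):
    """Number of blocks needed for a file (integer ceiling division)."""
    return -(-file_size_bytes // block_size_bytes)

# Written while the configured default was 64MB.
old_file = {"size": 200 * MB, "block_size": 64 * MB}
# Written after the default was changed to 128MB.
new_file = {"size": 200 * MB, "block_size": 128 * MB}

old_blocks = block_count(old_file["size"], old_file["block_size"])
new_blocks = block_count(new_file["size"], new_file["block_size"])
print(old_blocks)  # 4
print(new_blocks)  # 2
```

The old file keeps its four 64MB-based blocks; only files written after the
change pick up the new size.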
> 3. What if the block replication factor was one (yea not in most
> deployments but say incase) so does the namenode recreate a file once the
> datanode rejoins?
Files exist in the NN's metadata (its fsimage/edits persist this).
The blocks belonging to a file exist on the DNs. If the file had a single
replica and that replica was lost, then the file's data is lost and
the NameNode will tell you as much in its metrics/fsck.
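
The single-replica case can be sketched as a small classification, again as
a toy model: the state labels and the block_state function are hypothetical
and only loosely mirror what fsck reports.

```python
# Toy classification of a block by its live replica count.
# The labels are illustrative, not literal fsck output.

def block_state(live_replicas, target_replication):
    if live_replicas == 0:
        return "MISSING"            # no copy left anywhere: data is lost
    if live_replicas < target_replication:
        return "UNDER_REPLICATED"   # recoverable from a surviving replica
    return "HEALTHY"

# Replication factor 1 and the only replica is gone.
single = block_state(live_replicas=0, target_replication=1)
# Replication factor 3 with one replica gone.
triple = block_state(live_replicas=2, target_replication=3)
print(single)  # MISSING
print(triple)  # UNDER_REPLICATED
```

With a replication factor of one there is no surviving copy to re-replicate
from, which is why the NN can only report the loss.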