Hadoop, mail # user - what happens when a datanode rejoins?

Mehul Choube 2012-09-11, 07:14
Re: what happens when a datanode rejoins?
Harsh J 2012-09-11, 08:03
George has answered most of these. I'll just add on:

On Tue, Sep 11, 2012 at 12:44 PM, Mehul Choube wrote:
> 1.       Some of the blocks it was managing are deleted/modified?

A DN sends a block report to the NN when it starts, listing every
block replica it holds. The NN checks the report against its own
metadata, and if any block is left with fewer replicas than its file
requires, it schedules a re-replication from one of the good DNs that
still carry it. Blocks modified outside of HDFS fail their stored
checksums, so they are treated as corrupt, deleted, and re-replicated
in the same manner.
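
To make the reconciliation concrete, here is a rough sketch of the
idea in Python. It is a toy model, not the NameNode's actual code:
the class NameNodeModel, its fields, and the explicit "corrupt"
argument are all made up for illustration, and the real NN tracks far
more state than this.

```python
from dataclasses import dataclass, field


@dataclass
class NameNodeModel:
    # Hypothetical simplification: block id -> desired replica count.
    expected: dict
    # block id -> set of datanode ids known to hold a good replica.
    locations: dict = field(default_factory=dict)

    def process_block_report(self, dn_id, reported, corrupt=()):
        """Record one DN's report and return under-replicated block ids."""
        corrupt = set(corrupt)
        # Forget whatever we previously believed this DN was holding.
        for holders in self.locations.values():
            holders.discard(dn_id)
        # Accept only blocks the NN knows about and that are not corrupt.
        for block_id in reported:
            if block_id in self.expected and block_id not in corrupt:
                self.locations.setdefault(block_id, set()).add(dn_id)
        # Blocks with at least one good replica but fewer than desired
        # are the ones that can be re-replicated from a surviving DN.
        return sorted(
            b for b, want in self.expected.items()
            if 0 < len(self.locations.get(b, ())) < want
        )


nn = NameNodeModel(expected={"blk_1": 2, "blk_2": 2})
nn.process_block_report("dn1", ["blk_1", "blk_2"])
# dn2 rejoins; its copy of blk_2 was modified outside HDFS.
todo = nn.process_block_report("dn2", ["blk_1", "blk_2"], corrupt=["blk_2"])
```

In this run, todo ends up containing only blk_2, since dn1 still
holds a good copy that can serve as the re-replication source.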

> 2.       The size of the blocks are now modified say from 64MB to 128MB?

George's got this already. Changing the block size does not affect any
existing blocks. It is a per-file metadata property, set when the file
is created.
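
A small Python sketch of what "per-file" means here. The names
(FileMeta, create_file, default_block_size) are invented for the
example and do not correspond to HDFS classes; the only point is that
each file remembers the block size it was created with.

```python
import math

MB = 1024 * 1024

# Hypothetical stand-in for the cluster-wide default block size.
default_block_size = 64 * MB


class FileMeta:
    def __init__(self, length, block_size):
        self.length = length
        # Captured once at creation; later config changes do not touch it.
        self.block_size = block_size

    def num_blocks(self):
        return math.ceil(self.length / self.block_size)


def create_file(length):
    # A new file picks up whatever the default is at creation time.
    return FileMeta(length, default_block_size)


old_file = create_file(256 * MB)      # created under the 64MB default
default_block_size = 128 * MB         # admin changes the default
new_file = create_file(256 * MB)      # created under the 128MB default

old_blocks = old_file.num_blocks()
new_blocks = new_file.num_blocks()
```

The old file still spans four 64MB blocks, while the new file of the
same length spans two 128MB blocks.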

> 3.       What if the block replication factor was one (yea not in most
> deployments but say incase) so does the namenode recreate a file once the
> datanode rejoins?

Files exist in the NN's metadata (its fsimage/edits persist this).
The blocks belonging to a file exist on the DNs. If the file had a
single replica and that replica was lost, then the file's data is lost
and the NameNode will tell you as much in its metrics/fsck.
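
As a rough illustration of that split between file metadata and block
locations, here is a toy check in Python. The function missing_blocks
and the dictionaries are hypothetical and only loosely mirror what
fsck reports as missing blocks.

```python
def missing_blocks(file_blocks, live_locations):
    """Return the blocks of a file that have no live replica anywhere."""
    return [b for b in file_blocks if not live_locations.get(b)]


# The file entry (its list of blocks) lives in the NN metadata.
file_blocks = ["blk_1", "blk_2"]

# Replication factor 1: blk_2's only holder is currently gone.
live_locations = {"blk_1": {"dn1"}, "blk_2": set()}
while_down = missing_blocks(file_blocks, live_locations)

# The DN rejoins with the replica intact and reports it again.
live_locations["blk_2"].add("dn2")
after_rejoin = missing_blocks(file_blocks, live_locations)
```

While the DN is away, blk_2 shows up as missing even though the file
entry itself is still there. Once the DN reports the intact replica,
nothing is missing, and nothing had to be recreated by the NN.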

Harsh J
George Datskos 2012-09-11, 07:25
George Datskos 2012-09-11, 07:32