

Re: question about hdfs data loss risk

1) You may want to read about proper node decommissioning.

2) The NameNode re-replicates blocks whenever they fall below their
replication factor.

3) The NameNode does not give up on a block. It keeps the block in its
metadata even while no replica is reachable.

4) Yes, ultimately, if you have a replication factor of n and the n
replicas are lost at the same time, well, the data is truly lost. But
that's not specific to Hadoop.
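
To make points 2) to 4) concrete for the scenario quoted below, here is a
toy model in Python. It is only a sketch under stated assumptions, not how
the NameNode is actually implemented: the class ToyNameNode, its methods,
and the names dn1, dn2 and blk_1 are all made up for illustration. It also
leaves out the re-replication from point 2), which in a real cluster could
restore a second replica between steps 1 and 2.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for the NameNode's block map (illustration only)."""
    replication: int
    # block id -> set of datanodes currently reporting a replica
    live: dict = field(default_factory=dict)

    def add_block(self, block, nodes):
        self.live[block] = set(nodes)

    def lose_replicas(self, node, blocks=None):
        # blocks=None models the whole node going away;
        # a set of block ids models a single dead disk on that node.
        for block, nodes in self.live.items():
            if blocks is None or block in blocks:
                nodes.discard(node)

    def block_report(self, node, blocks):
        # A returning datanode reports the blocks it still holds.
        # The block entry was never deleted, so the replica counts again.
        for block in blocks:
            if block in self.live:
                self.live[block].add(node)

    def status(self, block):
        count = len(self.live[block])
        if count == 0:
            return "missing"
        if count < self.replication:
            return "under-replicated"
        return "healthy"


def run_scenario():
    nn = ToyNameNode(replication=2)
    nn.add_block("blk_1", ["dn1", "dn2"])
    history = []

    nn.lose_replicas("dn1")                # step 1: dn1 down for maintenance
    history.append(nn.status("blk_1"))

    nn.lose_replicas("dn2", {"blk_1"})     # step 2: disk holding blk_1 dies on dn2
    history.append(nn.status("blk_1"))

    nn.block_report("dn1", ["blk_1"])      # step 3: dn1 comes back and reports
    history.append(nn.status("blk_1"))
    return history


if __name__ == "__main__":
    print(run_scenario())
```

In this model the block goes from under-replicated to missing and back to
under-replicated once dn1 reports again. At that point the re-replication
from point 2) would bring it back to two replicas. The data is only gone
for good if dn1's copy is lost as well, which is the case in point 4).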

On Sun, Oct 27, 2013 at 7:42 PM, Koert Kuipers <[EMAIL PROTECTED]> wrote:

> i have a cluster with replication factor 2. with the following events in
> this order, do i have data loss?
> 1) shut down a datanode for maintenance unrelated to hdfs. so now some
> blocks only have replication factor 1
> 2) a disk dies in another datanode. let's assume some blocks now have
> replication factor 0 since they were on this disk that died and on the
> datanode that is shut down for maintenance.
> 3) bring back up the datanode that was down for maintenance.
> what i am worried about is that the namenode gives up on a block with
> replication factor 0 after steps 1) and 2) and considers it lost, and by
> the time the replica will come back on again in step 3) the namenode no
> longer considers the block to be existent.
> thanks! koert
Bertrand Dechoux