HDFS >> mail # dev >> Datanode fencing mechanism


Datanode fencing mechanism
The JIRA issue https://issues.apache.org/jira/browse/HDFS-1972 describes
the following case:
Scenario 3: DN restarts during split brain period

(this scenario illustrates why I think we need to persistently record the
promise about who is active)

   - block has 2 replicas, user asks to reduce to 1
   - NN1 adds the block to DN1's invalidation queue, but it's backed up
   behind a bunch of other commands, so doesn't get issued yet.
   - Failover occurs, but NN1 still thinks it's active.
   - DN1 promises to NN2 not to accept commands from NN1. It sends an empty
   deletion report to NN2. Then, it crashes.
   - NN2 has received a deletion report from everyone, and asks DN2 to
   delete the block. It hasn't realized that DN1 is crashed yet.
   - DN2 deletes the block.
   - DN1 starts back up. When it comes back up, it talks to NN1 first
   (maybe it takes a while to connect to NN2 for some reason)
      - ** Now, if we had saved the "promise" as part of persistent state,
      we could ignore NN1 and avoid this issue. Otherwise:
      - NN1 still thinks it's active, and sends a command to DN1 to delete
      the block. DN1 does so.
      - We lost the block.
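
To make the scenario concrete, here is a minimal toy sketch of the
difference between keeping the "promise" only in memory and saving it as
persistent state. It is not taken from the HDFS or CDH source: the class
ToyDataNode, the file promise.json, and the function run_scenario are all
hypothetical names invented for illustration, and the real DataNode may
handle this in a different way.

```python
import json
import os
import tempfile


class ToyDataNode:
    """Toy model of a DataNode (hypothetical; not the real HDFS class)."""

    def __init__(self, state_path, persist_promise):
        self.state_path = state_path
        self.persist_promise = persist_promise
        # The replica lives on disk, so it is still there after a restart.
        self.blocks = {"blk_1"}
        # Which NameNode this DataNode has promised to obey, if any.
        self.active_nn = None
        # On startup, reload the promise if it was saved before the crash.
        if persist_promise and os.path.exists(state_path):
            with open(state_path) as f:
                self.active_nn = json.load(f)["active_nn"]

    def promise(self, nn_id):
        """Promise to accept commands only from nn_id."""
        self.active_nn = nn_id
        if self.persist_promise:
            # Write to a temp file, fsync, then rename over the old state.
            tmp = self.state_path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"active_nn": nn_id}, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, self.state_path)

    def delete_block(self, nn_id, block_id):
        """Apply a delete command unless it comes from a fenced NameNode."""
        if self.active_nn is not None and nn_id != self.active_nn:
            return False  # command ignored
        self.blocks.discard(block_id)
        return True


def run_scenario(persist_promise):
    """Replay the DN1 part of scenario 3; return True if the block survives."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "promise.json")
        dn1 = ToyDataNode(path, persist_promise)
        dn1.promise("NN2")                         # DN1 promises to NN2
        dn1 = ToyDataNode(path, persist_promise)   # DN1 crashes and restarts
        dn1.delete_block("NN1", "blk_1")           # stale NN1 talks to it first
        return "blk_1" in dn1.blocks
```

In this sketch, run_scenario(False) returns False (DN1 forgets the promise
on restart and obeys NN1, so the last replica is deleted), while
run_scenario(True) returns True (DN1 reloads the promise and ignores NN1).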
I am using CDH4.3.1 and have been reading the DataNode code, but I cannot
find where the DataNode saves the "promise" as part of its persistent
state. I would like to know whether scenario 3 is handled in CDH4.3.1. If
it is handled, where is the code?
Thanks,

LiuLe