I had an instance where a datanode died while writing a block. I am using
Hadoop 2.0 patched with HDFS-3703 for stale node detection every 20 seconds.
Looking at the namenode logs, the block being written to went into the
UNDER_RECOVERY state, and there were several internalRecoverLease() calls
because there were readers on that block. I have a couple of questions about
this:
1) I see that when a block is UNDER_RECOVERY, it is added to recoverBlocks
for each dataNodeDescriptor that holds the block. Then a recoverBlock call
is issued to each primary datanode. What does the recoverBlock call do on
a datanode? Does it sync the block on that node to the other 2 datanodes?
In my case one of the datanodes is unreachable; what is the behaviour in
such a case?
2) When a client wants to read a block which is "UNDER_RECOVERY", do we
continue to suggest all 3 datanodes as replicas for reads, or do we pick
the one which is marked as primary for the block recovery?