It would be nice if someone could help out with this. It looks like a
trivial question, but it seems some blocks are being lost for us when datanodes
On Fri, Apr 19, 2013 at 2:28 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> I had an instance where a datanode died while writing a block. I am using
> Hadoop 2.0 patched with HDFS-3703 for stale node detection every 20 seconds.
> According to the namenode logs, the block being written went into the
> UNDER_RECOVERY state, and there were several internalRecoverLease() calls
> because there were readers on that block. I have a couple of questions about
> the code:
> 1) I see that when a block is UNDER_RECOVERY, it is added to recoverBlocks
> for each dataNodeDescriptor that holds the block. Then a recoverBlock call
> is issued to each primary data node. What does the recoverBlock call do on
> a datanode? Does it sync the block on that node to the other 2 data nodes?
> In my case one of the data nodes is unreachable. What is the behaviour in
> such a case?
> 2) When a client wants to read a block which is "UNDER_RECOVERY", do we
> continue to suggest all 3 data nodes as replicas for reads, or do we pick
> the one which is marked as primary for the block recovery?