HDFS >> mail # dev >> Meaning of UNDER_RECOVERY blocks


Meaning of UNDER_RECOVERY blocks
Hi,

I had an instance where a datanode died while writing a block. I am using
Hadoop 2.0 patched with HDFS-3703 for stale node detection (every 20 seconds).

According to the namenode logs, the block being written went into the
UNDER_RECOVERY state, and there were several internalRecoverLease() calls
because there were readers on that block. I have a couple of questions about
the code:

1) I see that when a block is UNDER_RECOVERY, it is added to recoverBlocks
for each dataNodeDescriptor that holds the block. Then a recoverBlock call
is issued to each primary datanode. What does the recoverBlock call do on
a datanode? Does it sync the block on that node to the other two datanodes?
In my case one of the datanodes is unreachable; what is the behaviour in
that case?

2) When a client wants to read a block that is UNDER_RECOVERY, do we
continue to suggest all 3 datanodes as replicas for reads, or do we pick
the one that is marked as primary for the block recovery?

Thanks