Re: doubt about reduce tasks and block writes
Raj Vishwanathan 2012-08-24, 22:44
But since node A has no TT running, it will not run map or reduce tasks. When the reducer node writes the output file, the first replica of each block will be written on the local node and never on node A.
So, to answer the question: node A will contain copies of blocks of output files. It won't contain copy 0 (the first replica) of any output file.
I am reasonably sure about this, but there could be corner cases, such as node failures. I need to look into the code.
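To make the reasoning concrete, here is a small toy simulation of the placement rule described above. It is only a sketch and not Hadoop's actual code: it ignores racks, node load, and failures, and the function name place_replicas and the node names B to E are made up for illustration. It assumes the first replica goes to the writer's own node when that node runs a DN, and the remaining replicas go to other DNs chosen at random.

```python
import random

def place_replicas(writer, datanodes, replication=3, rng=random):
    """Toy, rack-unaware placement (an assumption, not Hadoop's real policy code).

    The first replica goes to the writer's node if it runs a DataNode;
    the remaining replicas go to other, distinct DataNodes picked at random.
    """
    if writer in datanodes:
        first = writer
    else:
        first = rng.choice(sorted(datanodes))
    others = [n for n in sorted(datanodes) if n != first]
    rest = rng.sample(others, min(replication - 1, len(others)))
    return [first] + rest

# Cluster from the question: 5 DataNodes, but node A runs no TaskTracker.
datanodes = {"A", "B", "C", "D", "E"}
tasktrackers = {"B", "C", "D", "E"}

rng = random.Random(0)  # fixed seed so the run is repeatable
placements = [
    place_replicas(rng.choice(sorted(tasktrackers)), datanodes, rng=rng)
    for _ in range(1000)
]

# Count blocks whose first replica landed on A, and blocks with any replica on A.
first_on_a = sum(1 for p in placements if p[0] == "A")
any_on_a = sum(1 for p in placements if "A" in p)
print("blocks with first replica on A:", first_on_a)
print("blocks with some replica on A:", any_on_a)
```

Under these assumptions, node A never holds the first replica because no writer ever runs there, but it still receives replicas through the pipeline, so it does end up storing output blocks.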
> From: Marc Sturlese <[EMAIL PROTECTED]>
>To: [EMAIL PROTECTED]
>Sent: Friday, August 24, 2012 1:09 PM
>Subject: doubt about reduce tasks and block writes
>I have a doubt about reduce tasks and block writes. Does a reduce task always
>first write to HDFS on the node where it is placed? (And are these
>blocks then replicated to other nodes?)
>If so, suppose I have a cluster of 5 nodes, where 4 of them run DN and TT and one
>(node A) runs only a DN. When running MR jobs, would map tasks never read from
>node A? This would be because maps have data locality, and if the reduce
>tasks write first to the node where they live, one replica of each block
>would always be on a node that has a TT. Node A would just contain blocks
>created through replication by the framework, as no reduce task would write
>there directly. Is this correct?
>Thanks in advance
>View this message in context: http://lucene.472066.n3.nabble.com/doubt-about-reduce-tasks-and-block-writes-tp4003185.html