Re: When and who move the reduce output file part-0000X to the final output directory
Ling Kun 2013-05-10, 05:40
Your reply helps me a lot.
On Fri, May 10, 2013 at 1:26 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> The task itself moves it when it receives a commitTask message. See
> the OutputCommitter class:
> On Fri, May 10, 2013 at 8:49 AM, Ling Kun <[EMAIL PROTECTED]> wrote:
> > Dear all,
> > I am looking into the MR workflow, and want to know more details about
> > how the reduce output data is copied.
> > Here is my question.
> > For the DFSIO test or some other MR jobs, each reduce task will run on a
> > TT, and generate files in a directory named like this:
> > "XXX//_temporary/_attempt_201305101045_0005_r_000000_0/". There will also be
> > a result file named part-00000.
> > After the reducer finishes the task, the reducer output data part-00000 will
> > be moved from the local disk to HDFS.
> > My question is: is part-00000 copied to HDFS at the moment the reducer
> > finishes the task? Who makes this file copy happen?
> > The reducer child? The TaskTracker which runs the reduce task? Or the
> > Thanks,
> > yours,
> > Kun Ling
> > --
> > http://www.lingcc.com
> Harsh J
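
Harsh's answer (the task itself promotes its output when told to commit) can be illustrated with a small local-filesystem sketch. This is not Hadoop code: the function names `task_attempt_dir`, `write_task_output`, `commit_task`, and `abort_task` are hypothetical, and only the directory layout (`_temporary/_attempt_...` under the output directory, holding `part-00000`) is taken from the thread. As far as I understand the default committer, the attempt directory sits under the job output path on the same file system, so the commit step is a rename rather than a copy from local disk; treat that as an assumption here.

```python
import os
import shutil
import tempfile


def task_attempt_dir(output_dir, attempt_id):
    # Hypothetical helper: per-attempt scratch directory under _temporary,
    # mirroring the "_temporary/_attempt_..." layout quoted in the thread.
    return os.path.join(output_dir, "_temporary", "_" + attempt_id)


def write_task_output(output_dir, attempt_id, part_name, data):
    # While running, the task writes only into its own attempt directory,
    # so a failed or duplicate attempt never touches the final output.
    attempt_dir = task_attempt_dir(output_dir, attempt_id)
    os.makedirs(attempt_dir, exist_ok=True)
    path = os.path.join(attempt_dir, part_name)
    with open(path, "w") as f:
        f.write(data)
    return path


def commit_task(output_dir, attempt_id):
    # Executed by the task process itself once it is told to commit:
    # every file in the attempt directory is renamed into the final
    # output directory, then the empty attempt directory is removed.
    attempt_dir = task_attempt_dir(output_dir, attempt_id)
    moved = []
    for name in sorted(os.listdir(attempt_dir)):
        os.rename(os.path.join(attempt_dir, name),
                  os.path.join(output_dir, name))
        moved.append(name)
    os.rmdir(attempt_dir)
    return moved


def abort_task(output_dir, attempt_id):
    # A failed or killed attempt simply discards its scratch directory.
    shutil.rmtree(task_attempt_dir(output_dir, attempt_id),
                  ignore_errors=True)
```

Calling `write_task_output(out, "attempt_201305101045_0005_r_000000_0", "part-00000", ...)` followed by `commit_task(out, ...)` leaves `part-00000` directly under `out`. Until the commit runs, the file is visible only inside the attempt directory, which is why the final output directory never shows partial results from an attempt that later fails.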