I don't see a direct question asked, but here's a condition in the
source code you may want to take a look at (*):
(*) - This condition has yet to appear in MRv2; see (or help out with) MAPREDUCE-2723.
On Wed, May 29, 2013 at 8:10 PM, Rahul Bhattacharjee
<[EMAIL PROTECTED]> wrote:
> I have a question about the reduce phase of MR jobs.
> The intermediate outputs of map tasks are pulled from the nodes that ran
> the map tasks to the node where the reducer is going to run, and that
> intermediate data is written to the reducer's local fs. My question: if a
> job processes a huge amount of data and has multiple mappers but only one
> reducer, then it's possible that the job would never complete successfully,
> as the single host's disk might not be sufficient to hold all the map outputs
> of the job.
> The job would essentially fail after retrying the configured number of attempts.
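
To put rough numbers on the scenario described above, here is a minimal back-of-the-envelope sketch. It is not the condition Hadoop itself checks; the function name, parameters, and sizes are all hypothetical, and it assumes map output is spread evenly across reducers while ignoring compression, merge overhead, and local directories that span several disks.

```python
GIB = 1024 ** 3

def reducer_disk_shortfall(map_output_bytes, num_reducers, free_disk_bytes):
    """Return how many bytes one reducer's input exceeds its free local
    disk by, or 0 if it fits. Hypothetical helper for illustration only."""
    total = sum(map_output_bytes)
    # Ceiling division: assume each reducer receives an equal share.
    per_reducer = -(-total // num_reducers)
    return max(0, per_reducer - free_disk_bytes)

# Assumed workload: 1000 mappers, each emitting 1 GiB of intermediate data,
# and 500 GiB of free local disk on each reducer host.
outputs = [1 * GIB] * 1000

single = reducer_disk_shortfall(outputs, num_reducers=1, free_disk_bytes=500 * GIB)
four = reducer_disk_shortfall(outputs, num_reducers=4, free_disk_bytes=500 * GIB)

print(single // GIB)  # 500 (GiB short with a single reducer)
print(four // GIB)    # 0 (fits once the data is split four ways)
```

Under these assumptions, one reducer would need twice the free space available, while four reducers would each hold about 250 GiB, which suggests that raising the reducer count is the usual way out when the job's logic allows it.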