
Re: Reduce side question on MR
I don't see a direct question asked, but here's a condition in the
source code you want to take a look at (*):

(*) - Yet to appear in MRv2 - See/help out with MAPREDUCE-2723.

On Wed, May 29, 2013 at 8:10 PM, Rahul Bhattacharjee wrote:
> Hi,
> I have one question related to the reduce phase of MR jobs.
> The intermediate outputs of map tasks are pulled from the nodes that ran the
> map tasks to the node where the reducer is going to run, and that intermediate
> data is written to the reducer's local fs. My question: if a job processes a
> huge amount of data with multiple mappers but only one reducer, then it's
> possible that the job would never complete successfully, as the single host's
> disk might not be sufficient to hold all the map outputs of the job.
> The job would essentially fail after retrying the configured number of attempts.
> Thanks,
> Rahul

Harsh J
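
The disk-space concern raised in the question comes down to simple arithmetic, sketched below. This is not the Hadoop source condition referred to in the reply; the function names, the even-split assumption (no key skew), and the byte figures are all hypothetical and chosen only for illustration.

```python
def estimated_reduce_input_bytes(total_map_output_bytes: int, num_reducers: int) -> int:
    """Even-split estimate of the shuffle data one reducer must hold.

    Assumption: map output is spread uniformly across reducers (no key skew).
    """
    if num_reducers < 1:
        raise ValueError("num_reducers must be at least 1")
    # Ceiling division so the estimate never undercounts.
    return -(-total_map_output_bytes // num_reducers)


def has_room_for_reduce(free_disk_bytes: int,
                        total_map_output_bytes: int,
                        num_reducers: int) -> bool:
    """Hypothetical check: can a host's free disk hold one reducer's input?"""
    needed = estimated_reduce_input_bytes(total_map_output_bytes, num_reducers)
    return free_disk_bytes >= needed


GIB = 1024 ** 3
total_map_output = 2000 * GIB  # hypothetical total intermediate data
free_disk = 500 * GIB          # hypothetical free space on the reducer host

# A single reducer has to hold all of the map output on one host.
print(has_room_for_reduce(free_disk, total_map_output, 1))
# Spreading the same data over eight reducers shrinks the per-host requirement.
print(has_room_for_reduce(free_disk, total_map_output, 8))
```

Under these assumed numbers the single-reducer case does not fit on one host, while eight reducers would need roughly 250 GiB each. Real jobs can deviate from the even split when keys are skewed.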