MapReduce, mail # user - Reduce side question on MR


Re: Reduce side question on MR
Rahul Bhattacharjee 2013-06-01, 07:33
Thanks Harsh for the response. It very much answers what I was looking for.

Regards,
Rahul
On Wed, May 29, 2013 at 8:10 PM, Rahul Bhattacharjee <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have one question related to the reduce phase of MR jobs.
>
> The intermediate outputs of map tasks are pulled from the nodes that ran
> the map tasks to the node where the reducer is going to run, and that
> intermediate data is written to the reducer's local fs. My question is:
> if a job processes a huge amount of data and has multiple mappers but
> only one reducer, is it possible that the job would never complete
> successfully because the single host's disk might not be sufficient to
> hold all the map outputs of the job?
>
> The job would essentially fail after retrying the configured number of
> attempts.
>
> Thanks,
> Rahul
>
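
To make the disk-space concern in the question concrete, here is a rough back-of-the-envelope sketch in Java. It is not from the thread: the class name, method names, and the sizes used in main are all hypothetical, and it assumes map output is spread evenly across reducers and ignores compression, in-memory merging, and temporary merge space.

```java
// Hypothetical helper, not part of Hadoop: estimates how much map output
// a reducer host must hold on local disk during the shuffle.
public final class ShuffleEstimate {
    private ShuffleEstimate() {}

    /** Approximate bytes each reducer pulls, assuming an even key distribution. */
    static long bytesPerReducer(long totalMapOutputBytes, int numReducers) {
        if (numReducers <= 0) {
            throw new IllegalArgumentException("numReducers must be positive");
        }
        // Ceiling division so the estimate never rounds down.
        return (totalMapOutputBytes + numReducers - 1) / numReducers;
    }

    /** Smallest reducer count whose even share fits in the usable local disk. */
    static int minReducers(long totalMapOutputBytes, long usableDiskBytesPerNode) {
        if (usableDiskBytesPerNode <= 0) {
            throw new IllegalArgumentException("usable disk must be positive");
        }
        long n = (totalMapOutputBytes + usableDiskBytesPerNode - 1) / usableDiskBytesPerNode;
        return Math.toIntExact(Math.max(1L, n));
    }

    public static void main(String[] args) {
        long tib = 1024L * 1024 * 1024 * 1024;
        long totalMapOutput = 10 * tib; // assumed total map output
        long usableDisk = 2 * tib;      // assumed free local disk per reducer host

        System.out.println("Bytes for a single reducer: "
                + bytesPerReducer(totalMapOutput, 1));
        System.out.println("Minimum reducers to fit on disk: "
                + minReducers(totalMapOutput, usableDisk));
    }
}
```

Under these assumptions, a single reducer has to hold the entire map output, which is the scenario described above. In a real job the reducer count is set with Job.setNumReduceTasks(int), and skewed keys can leave one reducer with far more than its even share.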