Hadoop, mail # user - can local disk of reduce task cause the job to fail?


Re: can local disk of reduce task cause the job to fail?
Mohit Anchlia 2012-12-09, 17:15
The reducer will not start executing until the shuffle and sort phases are complete.
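
As a rough illustration of that ordering, here is a minimal Python sketch. It is not Hadoop's actual code; `run_reduce_task` and `sum_values` are hypothetical names made up for this example. It assumes each map output segment is a list of (key, value) pairs for one reduce partition. All segments are fetched first, then merged lazily, and only then does the reduce function run, receiving the values for each key through an iterator.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def run_reduce_task(map_outputs, reduce_fn):
    """Toy model of one reduce task (hypothetical helper, not a Hadoop API)."""
    # Shuffle: copy every map's output for this partition before reducing.
    # In Hadoop the fetched segments sit in memory or are spilled to local
    # disk; here they are simply held in a list.
    fetched = [sorted(segment, key=itemgetter(0)) for segment in map_outputs]

    # Sort/merge: lazy k-way merge over the already-sorted segments.
    merged = heapq.merge(*fetched, key=itemgetter(0))

    # Reduce: starts only after every segment has been fetched. The values
    # for a key are handed over as an iterator, not as a fully built list.
    results = {}
    for key, group in groupby(merged, key=itemgetter(0)):
        results[key] = reduce_fn(key, (value for _, value in group))
    return results


def sum_values(key, values):
    # Example reduce function: consumes the value iterator once.
    return sum(values)
```

For example, `run_reduce_task([[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]], sum_values)` groups both values of "a" together before reducing. The point of the sketch is that the values for one key are streamed to the reduce function, but the fetched map output for the partition still has to fit somewhere on the reduce node (memory plus local disk) before the reduce function is first called.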

On Dec 9, 2012, at 4:09 AM, Majid Azimi <[EMAIL PROTECTED]> wrote:

> Hi guys,
>
> Hadoop: The Definitive Guide says that reduce tasks start only when all maps have finished their work. Also, this link says:
>
> >> The shuffle and sort phases occur simultaneously; while map-outputs are being fetched they are merged.
>
> What I have understood is that when a reduce task starts, all the data it needs (each key and its associated values) has already been transferred to its local node. Am I right? If so, the node running the reduce task must have enough storage to hold all values associated with that key, or else the job will fail.
>
> If not, then the reduce task starts with whatever data is available and the shuffle and sort phase feeds it continuously, so low storage on the node does not cause a problem because data arrives on demand.
>
> Which of the two cases actually happens?