Hadoop >> mail # user >> can local disk of reduce task cause the job to fail?


Majid Azimi 2012-12-09, 12:09
Re: can local disk of reduce task cause the job to fail?
The reducer will not start executing until the shuffle and sort phases are complete.
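
To make the ordering concrete, here is a toy sketch in Python. It is not Hadoop code: run_reduce_task, sum_values, and the sample data are hypothetical names made up for illustration, and everything is held in memory, whereas a real reduce task can spill fetched map output to local disk. It only models the sequence: fetch this reducer's partition from every map, merge the sorted segments, and only then call the reduce function, which receives the values for each key through an iterator.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def run_reduce_task(map_outputs, reduce_fn):
    """Toy model of one reduce task: copy, merge, then reduce."""
    # Copy (shuffle) phase: collect this reducer's partition from every map.
    # The toy sorts each segment here so the merge below sees sorted input.
    fetched = [sorted(segment, key=itemgetter(0)) for segment in map_outputs]

    # Sort phase: a k-way merge of the sorted segments. heapq.merge is lazy,
    # so the merged stream is not built up as one big list.
    merged = heapq.merge(*fetched, key=itemgetter(0))

    # Reduce phase: runs only after every segment has been fetched.
    # The values for a key are handed over as an iterator, one at a time.
    results = {}
    for key, group in groupby(merged, key=itemgetter(0)):
        results[key] = reduce_fn(key, (value for _, value in group))
    return results


def sum_values(key, values):
    # Hypothetical reduce function: add up all values seen for the key.
    return sum(values)


# Three hypothetical map outputs, already partitioned for this reducer.
map_outputs = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
    [("b", 5)],
]
totals = run_reduce_task(map_outputs, sum_values)
print(totals)  # {'a': 4, 'b': 7, 'c': 4}
```

In this model the storage requirement comes from the fetched segments as a whole, not from any single key, since the reduce function only pulls values one at a time from the merged stream.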


On Dec 9, 2012, at 4:09 AM, Majid Azimi <[EMAIL PROTECTED]> wrote:

> Hi guys,
>
> Hadoop: The Definitive Guide says that reduce tasks start only when all maps have finished their work. Also, this link says:
>
> >> The shuffle and sort phases occur simultaneously; while map-outputs are being fetched they are merged.
>
> My understanding is that when a reduce task starts, all the data it needs (including a key and its associated values) has already been transferred to its local node. Am I right? If so, the node running the reduce task must have enough storage to hold all values associated with that key, or else the job will fail.
>
> If not, then the reduce task starts with whatever data is available and the shuffle and sort phases feed it continuously, so low storage on the node does not cause a problem because data arrives on demand.
>
> Which of the two cases actually happens?
jamal sasha 2012-12-09, 17:19