Majid Azimi 2012-12-09, 12:09
Re: can local disk of reduce task cause the job to fail?
Mohit Anchlia 2012-12-09, 17:15
The reducer will not start executing until the shuffle and sort phases are complete.
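To illustrate why, here is a toy Python sketch, not Hadoop code. It assumes each map task has already produced a key-sorted list of (key, value) pairs for one reduce partition. The names `shuffle_and_sort`, `run_reduce`, and `map_outputs` are made up for the illustration. The merge needs every map output as an input before it can safely hand the first key to the reduce function, because any output might still hold values for that key. The sketch does not model fetching over the network or where the fetched data is stored.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def shuffle_and_sort(map_outputs):
    """Merge already-sorted map outputs into one key-sorted stream.

    Toy stand-in for the shuffle + sort phase; every map output must be
    supplied here, since any of them may contain the smallest key.
    """
    return heapq.merge(*map_outputs, key=itemgetter(0))


def run_reduce(merged, reduce_fn):
    """Call reduce_fn once per key, passing an iterator over that key's values."""
    return {
        key: reduce_fn(key, (value for _, value in group))
        for key, group in groupby(merged, key=itemgetter(0))
    }


# Hypothetical sorted outputs from three map tasks for the same reduce partition.
map_outputs = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
    [("b", 5), ("c", 6)],
]

merged = shuffle_and_sort(map_outputs)
result = run_reduce(merged, lambda key, values: sum(values))
print(result)  # {'a': 4, 'b': 7, 'c': 10}
```

Note that the reduce function receives an iterator, so the values for one key are consumed as a stream from the merged input rather than being handed over as a single in-memory list.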
Sent from my iPhone

On Dec 9, 2012, at 4:09 AM, Majid Azimi <[EMAIL PROTECTED]> wrote:

> Hi guys,
>
> Hadoop: The Definitive Guide says that reduce tasks will start only when all maps have done their work. Also this link says:
>
>> The shuffle and sort phases occur simultaneously; while map-outputs are being fetched they are merged.
>
> What I have understood is that when a reduce task starts, all the data it needs (including a key and its associated values) has been transferred to its local node. Am I right? If this is true, then the node running the reduce task must have enough storage to hold all values associated with that key, else the job will fail.
>
> If not, then the reduce task starts with some available data and the shuffle + sort phase feeds the reduce task continuously, so low storage on the node does not cause a problem because data arrives on demand.
>
> Which of the two cases actually happens?
jamal sasha 2012-12-09, 17:19