Oh I see. Does this mean there is another service and TCP listen port for this purpose?
Thanks for your indulgence. I would really like to read more about this without bothering the group, but I'm not sure where to start learning these internals other than by reading the code.
From: Kai Voigt [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, May 21, 2013 12:59 PM
To: [EMAIL PROTECTED]
Subject: Re: Shuffle phase replication factor
The map output doesn't get written to HDFS. The map task writes its output to its local disk, and the reduce tasks then pull the data over HTTP for further processing.
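To make that flow concrete, here is a toy sketch in Python. It is not Hadoop code and does not use any Hadoop API. The function names (write_map_output, serve_local_dir, fetch_partition, run_demo), the part-N.txt file layout, and the byte-sum partitioner are all made up for illustration. It only mirrors the idea: the map side writes partitioned output to a local directory, and each reducer fetches its own partition over HTTP.

```python
import http.server
import os
import tempfile
import threading
import urllib.request
from functools import partial


def write_map_output(local_dir, records, num_reducers):
    """Map side: partition (key, value) records, one local file per reducer."""
    partitions = {r: [] for r in range(num_reducers)}
    for key, value in records:
        # Toy partitioner (an assumption): byte sum of the key modulo reducers.
        partitions[sum(key.encode()) % num_reducers].append((key, value))
    for r, rows in partitions.items():
        with open(os.path.join(local_dir, f"part-{r}.txt"), "w") as f:
            for key, value in sorted(rows):
                f.write(f"{key}\t{value}\n")


def serve_local_dir(local_dir):
    """Map node: expose the local output directory over HTTP (ephemeral port)."""
    handler = partial(http.server.SimpleHTTPRequestHandler, directory=local_dir)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_partition(port, reducer_id):
    """Reduce side: pull this reducer's partition from the map node over HTTP."""
    url = f"http://127.0.0.1:{port}/part-{reducer_id}.txt"
    # Bypass any system proxy so the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as resp:
        lines = resp.read().decode().splitlines()
    return [tuple(line.split("\t")) for line in lines]


def run_demo():
    """Write map output locally, then let two 'reducers' fetch it over HTTP."""
    records = [("apple", "1"), ("banana", "1"), ("apple", "1"), ("cherry", "1")]
    with tempfile.TemporaryDirectory() as local_dir:
        write_map_output(local_dir, records, num_reducers=2)
        server = serve_local_dir(local_dir)
        try:
            port = server.server_address[1]
            return [fetch_partition(port, r) for r in range(2)]
        finally:
            server.shutdown()
            server.server_close()


if __name__ == "__main__":
    for reducer_id, rows in enumerate(run_demo()):
        print(reducer_id, rows)
```

Nothing in the sketch touches a distributed file system, and the intermediate files exist only on the "map" node's local disk, so no replication factor applies to them.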
On 21.05.2013 at 19:57, John Lilley <[EMAIL PROTECTED]> wrote:
When MapReduce enters "shuffle" to partition the tuples, I am assuming that it writes intermediate data to HDFS. What replication factor is used for those temporary files?