Re: Separate communications of HDFS and MapReduce
Allen Wittenauer 2010-04-26, 18:10
On Apr 26, 2010, at 6:23 AM, Druilhe Remi wrote:
> For example, when I run the "wordcount" example, there are both HDFS and MapReduce communications, and I am not able to distinguish which packets belong to HDFS and which to MapReduce.
This shouldn't be too surprising given that the MapReduce job needs to talk to HDFS to determine input and to write output.
> One way could be to use odd port numbers for HDFS and even port numbers for MapReduce, but I think I would have to modify the source code.
The ports for the services are already separated out: each HDFS and MapReduce daemon listens on its own configurable ports.
In general, client -> server connections map out as:
MR -> MR, HDFS
HDFS -> HDFS
Given a small 3-node grid, a dump of which processes open which ports, and a record of the connections made between the machines, it should be trivial to build a more detailed connection map. [You can probably even do it as a MapReduce job. :) ]
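
As a rough illustration of that idea, here is a minimal Python sketch that builds such a connection map. Everything specific in it is an assumption for the example: the host names, the hand-written `listeners` and `connections` tables (which in practice you would derive from something like lsof or netstat output), and the port numbers. 50010 and 50060 are the usual defaults for DataNode data transfer and the TaskTracker HTTP server, while 8020 and 8021 are just common choices for fs.default.name and mapred.job.tracker, so check your own configuration.

```python
from collections import defaultdict


def build_connection_map(listeners, connections):
    """Return {client_service: sorted list of server services it connects to}.

    listeners:   {(host, port): service} for every listening socket
    connections: iterable of (client_service, server_host, server_port)
    """
    cmap = defaultdict(set)
    for client_service, host, port in connections:
        # Ports that no known daemon listens on are reported as "unknown".
        server_service = listeners.get((host, port), "unknown")
        cmap[client_service].add(server_service)
    return {client: sorted(servers) for client, servers in cmap.items()}


# Hypothetical 3-node grid; port numbers are assumed, not read from a cluster.
listeners = {
    ("node1", 8020): "HDFS",   # NameNode RPC (assumed fs.default.name port)
    ("node1", 8021): "MR",     # JobTracker RPC (assumed mapred.job.tracker port)
    ("node2", 50010): "HDFS",  # DataNode data transfer
    ("node3", 50010): "HDFS",  # DataNode data transfer
    ("node2", 50060): "MR",    # TaskTracker HTTP (map output fetch)
    ("node3", 50060): "MR",    # TaskTracker HTTP (map output fetch)
}

# Each record: (service of the client process, server host, server port).
connections = [
    ("MR", "node1", 8021),     # TaskTracker talking to the JobTracker
    ("MR", "node1", 8020),     # task asking the NameNode about input
    ("MR", "node3", 50010),    # task reading a block from a DataNode
    ("MR", "node2", 50060),    # reducer fetching map output
    ("HDFS", "node1", 8020),   # DataNode talking to the NameNode
    ("HDFS", "node3", 50010),  # DataNode-to-DataNode replication
]

result = build_connection_map(listeners, connections)
for client, servers in sorted(result.items()):
    print(client, "->", ", ".join(servers))
```

With the sample data above, MR clients end up connected to both MR and HDFS servers, while HDFS clients only connect to HDFS servers, which matches the map given earlier. The same lookup on destination port is enough to label captured packets as HDFS or MapReduce traffic.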