When MapReduce assigns input splits to map tasks, can it assign a set of non-contiguous blocks to a single map task? I ask because, thinking the problem through, if I were the MR scheduler I would try to hand each map task a group of blocks that all reside on the same datanode, and then schedule the task on that node. For example, if I have an HDFS file with 10,000 blocks and I want 1,000 map tasks, I'd like each map task to get 10 blocks, but the blocks that live on any one datanode are unlikely to be contiguous in the file.
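To make the idea concrete, here is a minimal sketch of the grouping I have in mind. It is not how Hadoop actually builds splits; the function name `group_blocks_by_node`, the input layout (a dict from block index to the datanodes holding a replica), and the greedy balancing rule are all assumptions made up for illustration.

```python
from collections import defaultdict


def group_blocks_by_node(block_locations, blocks_per_task):
    """Group blocks into node-local tasks (hypothetical helper, not a Hadoop API).

    block_locations: dict mapping block index -> list of datanode names
                     that hold a replica of that block.
    blocks_per_task: maximum number of blocks handed to one map task.
    Returns a list of (datanode, [block indexes]) tuples.
    """
    # Greedy assumption: give each block to the replica holder that
    # currently has the fewest blocks, breaking ties by node name.
    per_node = defaultdict(list)
    for block_id in sorted(block_locations):
        replicas = block_locations[block_id]
        node = min(replicas, key=lambda n: (len(per_node[n]), n))
        per_node[node].append(block_id)

    # Cut each node's block list into tasks of at most blocks_per_task.
    tasks = []
    for node in sorted(per_node):
        blocks = per_node[node]
        for i in range(0, len(blocks), blocks_per_task):
            tasks.append((node, blocks[i:i + blocks_per_task]))
    return tasks


if __name__ == "__main__":
    # Toy layout: 4 blocks, 3 datanodes, 2 replicas per block.
    locations = {0: ["a", "b"], 1: ["a", "b"], 2: ["a", "c"], 3: ["b", "c"]}
    for node, blocks in group_blocks_by_node(locations, blocks_per_task=2):
        print(node, blocks)
```

In the toy layout, one task ends up with blocks 1 and 3 on the same node, which is exactly the non-contiguous case I am asking about.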
This is related to a question I asked earlier: whether there is any benefit to aligning data splits on block boundaries, so that reading a split does not spill over into the next block and require a connection to another datanode. The answer I got was that the extra connection overhead isn't significant. I bring it up again because comments in this discussion (https://issues.apache.org/jira/browse/HADOOP-3315) imply that an extra seek to the beginning of the file to read a magic number on open is a significant overhead, and that looks like a similar issue to me.
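As a rough way to quantify the alignment question, the sketch below counts how many fixed-size splits touch more than one block. The function name `splits_crossing_blocks` and the sizes are hypothetical, and the model is deliberately simplified: it treats splits as plain byte ranges and ignores record boundaries.

```python
def splits_crossing_blocks(file_size, block_size, split_size):
    """Return (crossing, total): how many byte-range splits span two or
    more blocks, and how many splits there are in all.

    Simplifying assumption: a split is exactly the byte range
    [start, end); records that straddle a split boundary are ignored.
    """
    crossing = 0
    total = 0
    start = 0
    while start < file_size:
        end = min(start + split_size, file_size)  # exclusive end offset
        first_block = start // block_size
        last_block = (end - 1) // block_size
        if last_block != first_block:
            crossing += 1  # this split needs data from another block
        total += 1
        start = end
    return crossing, total


if __name__ == "__main__":
    # Aligned: split size equals block size, so no split crosses a block.
    print(splits_crossing_blocks(1000, 100, 100))
    # Misaligned: split size is 1.5x the block size.
    print(splits_crossing_blocks(1000, 100, 150))
```

With aligned sizes no split crosses a block boundary; with the misaligned sizes nearly every split does, and each of those would need data from a second block that may sit on a different datanode.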
John Lilley 2013-07-01, 23:07