Many resources online suggest setting dfs.block.size to a larger value,
e.g. raising it from 64 MB to 256 MB, when the job is large, and they say
performance will improve. But I'm not clear on why increasing the block size
helps. I know that a larger block size reduces the number of map tasks for
the same input, but why would fewer map tasks improve overall performance?
Any comments would be highly valued, and thanks in advance.
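
For concreteness, here is a rough sketch of the task-count arithmetic I have in mind. It assumes one map task per block-sized split of a single input file, which I believe is the usual default for file-based input. The 100 GB input size and the helper name estimated_map_tasks are made up for illustration.

```python
MB = 1024 * 1024

def estimated_map_tasks(input_bytes, block_bytes):
    """Rough estimate: one map task per block-sized split of a single file.

    Uses integer ceiling division so a partial final block still counts
    as one task.
    """
    return (input_bytes + block_bytes - 1) // block_bytes

# Hypothetical input: a single 100 GB file.
input_size = 100 * 1024 * MB

for block_mb in (64, 128, 256):
    tasks = estimated_map_tasks(input_size, block_mb * MB)
    print(f"dfs.block.size = {block_mb} MB -> about {tasks} map tasks")
```

Under these assumptions, going from 64 MB to 256 MB cuts the estimate from 1600 map tasks to 400, so each task processes four times as much data.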