jeremy p 2013-03-13, 20:01
I think in your case it has to be even, because 200 mappers will fill all 200 slots (20 nodes x 10 slots). A more interesting case is 40 nodes: will exactly 5 slots be used on each node, or will some nodes get more than 5 mappers and others fewer? I don't remember the details, but I've had problems with unevenness in such scenarios. At least in MR1, you can usually force evenness by adjusting the number of map and reduce slots per node. In MR2 the map and reduce slots are combined, so achieving evenness will be more difficult.
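To make the slot arithmetic concrete, here is a rough sketch in Python. It is not Hadoop code and does not model the scheduler. It only counts slot capacity, assuming one mapper per input file, an otherwise idle cluster, and no data-locality effects. The function name slot_summary and its fields are made up for illustration.

```python
def slot_summary(num_tasks, num_nodes, slots_per_node):
    """Bounds on how many first-wave mappers a single node can receive.

    Assumption: capacity is the only constraint; the real scheduler
    (heartbeats, locality, other jobs) is deliberately ignored.
    """
    total_slots = num_nodes * slots_per_node
    # Mappers that can run at the same time in the first wave.
    first_wave = min(num_tasks, total_slots)
    # A node gets the fewest mappers when every other node is full.
    min_per_node = max(0, first_wave - (num_nodes - 1) * slots_per_node)
    # A node can never run more mappers than it has slots.
    max_per_node = min(slots_per_node, first_wave)
    return {
        "total_slots": total_slots,
        "first_wave": first_wave,
        "min_per_node": min_per_node,
        "max_per_node": max_per_node,
        # Even distribution is guaranteed only if the bounds coincide.
        "forced_even": min_per_node == max_per_node,
    }


if __name__ == "__main__":
    # 200 files, 20 nodes, 10 slots each: every slot is filled.
    print(slot_summary(200, 20, 10))
    # 200 files, 40 nodes, 10 slots each: capacity alone allows 0 to 10 per node.
    print(slot_summary(200, 40, 10))
    # Same 40 nodes with 5 slots each: evenness is forced again.
    print(slot_summary(200, 40, 5))
```

The third call is the "adjust the slots per node" trick: with 40 nodes and 5 map slots each, 200 mappers have nowhere to go except 5 per node. In MR1 the per-node map slot count is set with mapred.tasktracker.map.tasks.maximum in mapred-site.xml.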
----- Original Message -----
From: "jeremy p" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Wednesday, March 13, 2013 1:01:46 PM
Subject: Will hadoop always spread the work evenly between nodes?
Say I have 200 input files and 20 nodes, and each node has 10 mapper slots. Will Hadoop always allocate the work evenly, such that each node will get 10 input files and simultaneously start 10 mappers? Is there a way to force this behavior?