What does mapred.map.tasksperslot do?
David Parks 2012-12-27, 08:21
I didn't come up with much in a Google search.
In particular, what are the side effects of changing this setting? Memory?
I'm guessing it means that 2 input splits are fed to each map task, where a
map task is a self-contained JVM that consumes one map slot. Thus 4 map slots
and 2 tasksperslot would mean 4 map task JVMs, each of which processes 2 input
splits at a time.
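
The guess above can be written out as arithmetic. This is a minimal sketch of the presumed model only, not of verified Hadoop behavior: the function name presumed_layout and the 40-split workload are made up for illustration, and the assumption that each slot runs one JVM handling tasksperslot splits at once is exactly the presumption being asked about.

```python
def presumed_layout(map_slots, tasks_per_slot, input_splits):
    """Model the guessed meaning of tasksperslot (an assumption, not confirmed)."""
    # Assumption: one map task JVM occupies each map slot.
    jvms = map_slots
    # Assumption: each JVM works on tasks_per_slot input splits at a time.
    splits_in_flight = jvms * tasks_per_slot
    # Number of rounds needed to consume every split, rounded up.
    waves = -(-input_splits // splits_in_flight)
    return jvms, splits_in_flight, waves


# 4 map slots, 2 tasksperslot, and a hypothetical job with 40 input splits.
print(presumed_layout(4, 2, 40))
```

Under this model, raising tasksperslot lowers the number of rounds without changing the number of JVMs, which is where the presumed saving in task start-up overhead would come from.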
By increasing tasksperslot I presume we reduce the overhead needed to start
a new task (even though we're re-using the JVM in a typical configuration,
ours included), but we have more map output to sort and shuffle (I presume
the results of both map splits go into the same output).
Can someone verify those presumptions?