MapReduce >> mail # user >> What does mapred.map.tasksperslot do?


What does mapred.map.tasksperslot do?
A Google search didn't turn up much.

In particular, what are the side effects of changing this setting? Does it
affect memory use or the sort process?

I'm guessing it means that it'll feed 2 input splits to each map task, where
a map task is a self-contained JVM that consumes one map slot.

Thus 4 map slots and a tasksperslot of 2 would mean 4 map task JVMs, each of
which processes 2 input splits at a time.
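
The arithmetic behind this guess can be sketched as follows. It models only
the presumption stated above, not Hadoop's actual behaviour, and the function
name and its parameters are hypothetical names chosen for illustration.

```python
def presumed_concurrency(map_slots: int, tasks_per_slot: int) -> dict:
    """Model the guess: one map task JVM per slot, each handling
    tasks_per_slot input splits at a time. This is an assumption about
    the setting, not documented Hadoop behaviour."""
    if map_slots < 0 or tasks_per_slot < 1:
        raise ValueError("map_slots must be >= 0 and tasks_per_slot >= 1")
    return {
        "map_task_jvms": map_slots,
        "splits_in_flight": map_slots * tasks_per_slot,
    }


# The example from the message: 4 map slots, tasksperslot of 2.
print(presumed_concurrency(4, 2))
# prints {'map_task_jvms': 4, 'splits_in_flight': 8}
```

If the guess is right, the node would be working on 8 splits at once while
still running only 4 JVMs.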

By increasing tasksperslot, I presume we reduce the overhead of starting a
new task (even though we're already reusing the JVM, as in a typical
configuration, ours included), but we have more map output to sort and
shuffle (I presume the results of both map splits go into the same output).

Can someone verify those presumptions?