Re: What does mapred.map.tasksperslot do?
Hemanth Yamijala 2012-12-27, 08:43
David,
Could you please tell me what version of Hadoop you are using? I don't see this parameter in the stable (1.x) or current branch. I only see references to it with respect to EMR and with Hadoop 0.18 or so.

On Thu, Dec 27, 2012 at 1:51 PM, David Parks <[EMAIL PROTECTED]> wrote:

> I didn’t come up with much in a Google search.
>
> In particular, what are the side effects of changing this setting? Memory?
> Sort process?
>
> I’m guessing it means that it’ll feed 2 map tasks as input to each map
> task, a map task in turn is a self-contained JVM which consumes one map
> slot.
>
> Thus 4 map slots, and 2 tasksperslot means 4 map task JVMs each of which
> process 2 input splits at a time.
>
> By increasing the tasksperslot I presume we reduce overhead needed to
> start a new task (even though we’re re-using the JVM in typical
> configuration, ours included), but we have more map output to sort and
> shuffle (I presume the results of both map splits go into the same output).
>
> Can someone verify those presumptions?
David Parks 2012-12-27, 09:42