Re: increase number of map tasks
Bejoy Ks 2012-01-09, 17:41
Hi Satish
What is your value for mapred.max.split.size? Try setting these values as well:

mapred.min.split.size=0 (it is the default value)
mapred.max.split.size=40

Try executing your job once you apply these changes on top of the others you made.

Regards
Bejoy.K.S

On Mon, Jan 9, 2012 at 10:16 PM, sset <[EMAIL PROTECTED]> wrote:
> Hello,
>
> In HDFS we have set the block size to 40 bytes. The input data set is as
> below, with each record terminated by a line feed.
>
> data1 (5*8=40 bytes)
> data2
> ......
> .......
> data10
>
> But we still see only 2 map tasks spawned; there should have been at least
> 10 map tasks. Each mapper performs a complex mathematical computation. Not
> sure how it works internally. Line feed does not work. Even with the
> settings below, the number of map tasks never goes beyond 2. Is there any
> way to make this spawn 10 tasks? Basically it should look like a compute
> grid: computation in parallel.
>
> <property>
>   <name>io.bytes.per.checksum</name>
>   <value>30</value>
>   <description>The number of bytes per checksum. Must not be larger than
>   io.file.buffer.size.</description>
> </property>
>
> <property>
>   <name>dfs.block.size</name>
>   <value>30</value>
>   <description>The default block size for new files.</description>
> </property>
>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>10</value>
>   <description>The maximum number of map tasks that will be run
>   simultaneously by a task tracker.
>   </description>
> </property>
>
> We have a single node with a high configuration: 8 CPUs and 8 GB memory.
> Hence we are taking an example of 10 data items with line feeds. We want to
> utilize the full power of the machine, so we want at least 10 map tasks,
> each of which needs to perform a highly complex mathematical simulation. At
> present it looks like file data is the only way to specify the number of
> map tasks, via the split size (in bytes), but I would prefer some criterion
> like line feed or whatever.
>
> How do we get 10 map tasks from the above configuration? Please help.
>
> thanks
>
> --
> View this message in context:
> http://old.nabble.com/increase-number-of-map-tasks-tp33107775p33107775.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
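
To illustrate why the two split-size properties suggested in the reply matter, here is a rough sketch of the arithmetic. It assumes the job uses an input format derived from the new-API FileInputFormat (org.apache.hadoop.mapreduce.lib.input), where the split size is max(minSize, min(maxSize, blockSize)) and a trailing remainder of up to 10% is folded into the last split. The older mapred API derives its split size differently, so this may not apply there. The function names are hypothetical, and the 400-byte file size is an assumption (10 records of 40 bytes), not something stated in the thread.

```python
# Sketch of new-API FileInputFormat split arithmetic (assumption: the job
# uses that API). Function names here are illustrative, not Hadoop APIs.

SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size, max_size):
    """Split size rule: max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, split_size):
    """Count the splits produced for a single file of file_size bytes."""
    remaining = file_size
    splits = 0
    # Carve off full-size splits while more than SPLIT_SLOP splits remain.
    while remaining / split_size > SPLIT_SLOP:
        remaining -= split_size
        splits += 1
    # Whatever is left (if anything) becomes the final split.
    if remaining > 0:
        splits += 1
    return splits


if __name__ == "__main__":
    # Assumed input: 10 records x 40 bytes = 400 bytes in one file.
    file_size = 400
    # Values from the reply: min split size 0, max split size 40,
    # with the 40-byte block size mentioned in the question.
    split_size = compute_split_size(block_size=40, min_size=0, max_size=40)
    print(split_size, count_splits(file_size, split_size))
```

Under these assumptions the split size works out to 40 bytes and the 400-byte file yields 10 splits, which is one map task per split. Splits are cut by byte offset rather than by line feed, so the result depends on the actual file size and record lengths.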