Hadoop >> mail # user >> increase number of map tasks
Re: increase number of map tasks
Hi Satish,
      What is your current value for mapred.max.split.size? Try setting these
values as well:
mapred.min.split.size=0 (this is the default value)
mapred.max.split.size=40
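
In XML form, these would look like the property blocks quoted below (a sketch only; they can also be passed per-job rather than cluster-wide):

```xml
<!-- Sketch: the split-size settings suggested above; adjust the max value
     to match your record size -->
<property>
  <name>mapred.min.split.size</name>
  <value>0</value>
</property>

<property>
  <name>mapred.max.split.size</name>
  <value>40</value>
</property>
```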

Try running your job again once you have applied these changes on top of the
others you made.

Regards
Bejoy.K.S
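
As context for the suggestion above: in Hadoop's FileInputFormat (new API), the split size is derived from the block size bounded by the min/max split settings. The following is a minimal sketch of that rule, not the actual Hadoop code; the byte values are taken from this thread:

```java
// Sketch of the split-size rule used by Hadoop's FileInputFormat (new API):
// splitSize = max(minSize, min(maxSize, blockSize))
public class SplitSizeSketch {
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 40; // block size from the question, in bytes
        long minSize = 0;    // mapred.min.split.size
        long maxSize = 40;   // mapred.max.split.size
        long split = computeSplitSize(blockSize, minSize, maxSize);
        System.out.println("split size = " + split + " bytes");
        // A 400-byte input file would then yield about 400 / 40 = 10 splits,
        // i.e. roughly 10 map tasks.
        System.out.println("approx. splits = " + (400 / split));
    }
}
```

With the default mapred.max.split.size, the block size alone decides the split size, which is why capping the max is needed here to force more, smaller splits.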

On Mon, Jan 9, 2012 at 10:16 PM, sset <[EMAIL PROTECTED]> wrote:

>
> Hello,
>
> In HDFS we have set the block size to 40 bytes. The input data set is as
> below, terminated with line feeds.
>
> data1   (5*8=40 bytes)
> data2
> ......
> .......
> data10
>
>
> Still, we see only 2 map tasks spawned; there should have been at least 10
> map tasks. Each mapper performs a complex mathematical computation. Not sure
> how this works internally. The line feed does not work. Even with the
> settings below, the number of map tasks never goes beyond 2; is there any
> way to make this spawn 10 tasks? Basically it should behave like a compute
> grid, with the computations running in parallel.
>
> <property>
>  <name>io.bytes.per.checksum</name>
>  <value>30</value>
>  <description>The number of bytes per checksum.  Must not be larger than
>  io.file.buffer.size.</description>
> </property>
>
> <property>
>  <name>dfs.block.size</name>
>   <value>30</value>
>  <description>The default block size for new files.</description>
> </property>
>
> <property>
>  <name>mapred.tasktracker.map.tasks.maximum</name>
>  <value>10</value>
>  <description>The maximum number of map tasks that will be run
>  simultaneously by a task tracker.
>  </description>
> </property>
>
> This is a single node with a high configuration: 8 CPUs and 8 GB of memory.
> Hence we take an example of 10 data items with line feeds. We want to use
> the full power of the machine, so we want at least 10 map tasks; each task
> needs to perform a highly complex mathematical simulation. At present it
> looks like the split size (in bytes) is the only way to specify the number
> of map tasks, but I would prefer some criterion like a line feed or similar.
>
> How do we get 10 map tasks from the above configuration? Please help.
>
> thanks
>
> --
> View this message in context:
> http://old.nabble.com/increase-number-of-map-tasks-tp33107775p33107775.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>
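
A closing note on the asker's wish for a line-based criterion: Hadoop ships an NLineInputFormat that assigns a fixed number of input lines to each map task, which matches the one-computation-per-record use case directly. A sketch of the relevant setting (the job driver must also select NLineInputFormat as the input format, e.g. via JobConf.setInputFormat):

```xml
<!-- Sketch: one input line per map task when using NLineInputFormat -->
<property>
  <name>mapred.line.input.format.linespermap</name>
  <value>1</value>
</property>
```

With this, 10 input lines produce 10 map tasks regardless of block or split size.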