number of mapper tasks
Hello,

    I am using Hadoop with TextInputFormat, a mapper, and no reducers. I am
running my jobs on Amazon EMR. When I run my job, I set both of the
following options:
-s,mapred.tasktracker.map.tasks.maximum=10
-jobconf,mapred.map.tasks=10
    When I run my job with just 1 instance, I see it creates only 1 mapper.
When I run my job with 5 instances (1 master and 4 core nodes), I can see
that only 2 mapper slots are used and 6 stay open.
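
For reference, here is a minimal Python sketch of how, as far as I understand it, the old mapred API turns the mapred.map.tasks hint into input splits (one map task per split). It is an assumption modeled on Hadoop 1.x FileInputFormat.getSplits and simplified to a single input file; the function names, the file sizes, and the 64 MiB block size are hypothetical values for illustration, not taken from my job.

```python
# Simplified model of split counting in Hadoop 1.x's old-API FileInputFormat.
# Assumption: a single input file; all names and sizes here are illustrative.

SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size
MIB = 1024 * 1024


def compute_split_size(goal_size, min_size, block_size):
    """Split size is the goal size, capped by the block size, floored by min_size."""
    return max(min_size, min(goal_size, block_size))


def count_splits(file_size, requested_maps, block_size, min_size=1, splittable=True):
    """Return how many input splits (and so map tasks) one file would produce."""
    if file_size == 0 or not splittable:
        # Empty or non-splittable input becomes exactly one split.
        return 1

    # mapred.map.tasks only sets a *goal* size; it is a hint, not a guarantee.
    goal_size = file_size // max(requested_maps, 1)
    split_size = compute_split_size(goal_size, min_size, block_size)

    splits = 0
    remaining = file_size
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining != 0:
        splits += 1  # whatever is left becomes the final split
    return splits


if __name__ == "__main__":
    block = 64 * MIB
    print(count_splits(640 * MIB, requested_maps=10, block_size=block))
    print(count_splits(640 * MIB, requested_maps=2, block_size=block))
    print(count_splits(640 * MIB, requested_maps=10, block_size=block, splittable=False))
```

Under this model, mapred.map.tasks only influences the goal size, and an input that cannot be split (a gzip-compressed file, for example) always yields a single split regardless of the hint.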

     I am trying to figure out why I am not able to run more mappers in
parallel. When I look at the logs, I find messages like these:

INFO org.apache.hadoop.mapred.ReduceTask (main): attempt_201301281437_0001_r_000003_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
org.apache.hadoop.mapred.ReduceTask (main): attempt_201301281437_0001_r_000003_0 Need another 1 map output(s) where 0 is already in progress

    Any hints? They would be highly appreciated.

Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr