number of mapper tasks

    I am using Hadoop with TextInputFormat, a mapper, and no reducers. I am
running my jobs on Amazon EMR. When I run my job, I set both following
    When I run my job with just 1 instance, I see it only creates 1 mapper.
When I run my job with 5 instances (1 master and 4 cores), I can see that
only 2 mapper slots are used and 6 stay open.
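
    For context, the number of map tasks normally follows the number of input
splits rather than the number of free slots. The sketch below is only an
illustration of that relationship, assuming the split-size rule
max(minSize, min(maxSize, blockSize)) used by the new-API FileInputFormat and
ignoring the small slop allowed on the last split. The class name, method
names, and the file and block sizes are hypothetical and are not taken from
the job described here.

```java
public class SplitEstimate {
    // Assumed split-size rule: the block size, clamped between the
    // configured minimum and maximum split sizes.
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Rough number of splits (and therefore map tasks) for one splittable
    // file. Simplification: plain ceiling division, no slop factor.
    public static long estimateSplits(long fileSize, long splitSize) {
        if (fileSize <= 0) {
            return 0;
        }
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Illustrative values only: 64 MB blocks, no min/max override.
        long splitSize = computeSplitSize(64 * mb, 1L, Long.MAX_VALUE);
        System.out.println("50 MB file:  " + estimateSplits(50 * mb, splitSize) + " split(s)");
        System.out.println("200 MB file: " + estimateSplits(200 * mb, splitSize) + " split(s)");
    }
}
```

    Under these assumptions, an input smaller than one split yields a single
map task no matter how many slots the cluster offers, so the input size and
the split settings are worth checking alongside the slot count.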

     I am trying to figure out why I am not able to run more mappers in
parallel. When I look at the logs, I find messages like these:

INFO org.apache.hadoop.mapred.ReduceTask (main):
attempt_201301281437_0001_r_000003_0 Scheduled 0 outputs (0 slow hosts
and0 dup hosts)
org.apache.hadoop.mapred.ReduceTask (main):
attempt_201301281437_0001_r_000003_0 Need another 1 map output(s)
where 0 is already in progress

    Any hints? They would be highly appreciated.

Best regards,
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr