MapReduce, mail # user - number of mapper tasks


number of mapper tasks
Marcelo Elias Del Valle 2013-01-28, 15:54
Hello,

    I am using Hadoop with TextInputFormat, a mapper, and no reducers. I am
running my jobs on Amazon EMR. When I run my job, I set both of the following
options:
-s,mapred.tasktracker.map.tasks.maximum=10
-jobconf,mapred.map.tasks=10
    When I run my job with just 1 instance, I see it only creates 1 mapper.
When I run my job with 5 instances (1 master and 4 cores), I can see only 2
mapper slots are used and 6 stay open.

     I am trying to figure out why I cannot run more mappers in
parallel. When I look at the logs, I find messages like these:

INFO org.apache.hadoop.mapred.ReduceTask (main):
attempt_201301281437_0001_r_000003_0 Scheduled 0 outputs (0 slow hosts
and0 dup hosts)
org.apache.hadoop.mapred.ReduceTask (main):
attempt_201301281437_0001_r_000003_0 Need another 1 map output(s)
where 0 is already in progress

    Any hints? They would be highly appreciated.

Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
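
The two options in the question play different roles: mapred.tasktracker.map.tasks.maximum caps the map slots on each TaskTracker, while mapred.map.tasks is only a hint to the input format. The number of map tasks actually created equals the number of input splits. The sketch below is a simplified model of how the old-API FileInputFormat (the parent of TextInputFormat) arrives at that count, as far as I understand it. The function names, parameters, and file sizes are hypothetical and chosen for illustration; the model skips empty files and is not a diagnosis of this particular job.

```python
# Simplified model of split counting in the old (mapred) FileInputFormat.
# All names here are illustrative, not part of any Hadoop API.

SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(goal_size, min_size, block_size):
    """splitSize = max(minSize, min(goalSize, blockSize))."""
    return max(min_size, min(goal_size, block_size))


def count_splits(file_sizes, requested_maps, block_size,
                 min_size=1, splittable=True):
    """Estimate how many input splits (and so map tasks) a job gets.

    file_sizes     -- sizes of the input files in bytes
    requested_maps -- the value of mapred.map.tasks (a hint only)
    block_size     -- file system block size in bytes
    min_size       -- the value of mapred.min.split.size
    splittable     -- False for inputs such as gzip-compressed text
    """
    total = sum(file_sizes)
    goal_size = total // max(requested_maps, 1)
    split_size = compute_split_size(goal_size, min_size, block_size)

    splits = 0
    for size in file_sizes:
        if size == 0:
            continue  # assumption: empty files are ignored in this model
        if not splittable:
            splits += 1  # a non-splittable file always yields one split
            continue
        remaining = size
        while remaining / split_size > SPLIT_SLOP:
            splits += 1
            remaining -= split_size
        if remaining > 0:
            splits += 1
    return splits


MIB = 1024 * 1024
one_big_file = [1024 * MIB]

# A splittable 1 GiB file with 64 MiB blocks is cut at block granularity,
# regardless of the hint of 10.
splittable_count = count_splits(one_big_file, 10, 64 * MIB)

# The same file, if not splittable, produces a single split.
unsplittable_count = count_splits(one_big_file, 10, 64 * MIB,
                                  splittable=False)
```

In this model a single non-splittable input file gives one map task no matter how many slots are free, which is one situation where a hint of 10 would have no visible effect.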
Harsh J 2013-01-28, 16:02
Marcelo Elias Del Valle 2013-01-28, 16:31
Harsh J 2013-01-28, 16:41
Marcelo Elias Del Valle 2013-01-28, 16:55
Marcelo Elias Del Valle 2013-01-28, 20:56
Vinod Kumar Vavilapalli 2013-01-29, 02:08
Marcelo Elias Del Valle 2013-01-29, 10:52
Marcelo Elias Del Valle 2013-01-29, 12:53