MapReduce, mail # user - Why is Hadoop always running just 4 tasks?


Re: Why is Hadoop always running just 4 tasks?
Rahul Bhattacharjee 2013-12-11, 16:12
Not sure if I understand the question correctly. The reducers would start
only after all the mappers have completed.
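(One caveat to the above: in Hadoop 1.x the reduce-side shuffle can actually be scheduled before every mapper finishes; the threshold is the mapred.reduce.slowstart.completed.maps fraction, which defaults to 0.05. The reduce() calls themselves still wait for all map output. A hedged sketch of the relevant setting:

```xml
<!-- mapred-site.xml: fraction of map tasks that must complete before
     reduce tasks are scheduled (shuffle start). 0.05 is the 1.x default;
     raising it toward 1.0 keeps reduce slots free until the maps are
     nearly done. -->
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0.05</value>
</property>
```
)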

=Rahul
On Wed, Dec 11, 2013 at 7:59 AM, Dror, Ittay <[EMAIL PROTECTED]> wrote:

> I have a cluster of 4 machines with 24 cores and 7 disks each.
>
> On each node I copied from local a file of 500G. So I have 4 files in hdfs
> with many blocks. My replication factor is 1.
>
> I run a job (a scalding flow) and while there are 96 reducers pending,
> there are only 4 active map tasks.
>
> What am I doing wrong? Below is the configuration
>
> Thanks,
> Ittay
>
> <configuration>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>master:54311</value>
>   </property>
>
>   <property>
>     <name>mapred.map.tasks</name>
>     <value>96</value>
>   </property>
>
>   <property>
>     <name>mapred.reduce.tasks</name>
>     <value>96</value>
>   </property>
>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/hdfs/0/mapred/local,/hdfs/1/mapred/local,/hdfs/2/mapred/local,/hdfs/3/mapred/local,/hdfs/4/mapred/local,/hdfs/5/mapred/local,/hdfs/6/mapred/local,/hdfs/7/mapred/local</value>
>   </property>
>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>24</value>
>   </property>
>
>   <property>
>     <name>mapred.tasktracker.reduce.tasks.maximum</name>
>     <value>24</value>
>   </property>
> </configuration>
>
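Two things worth checking in the configuration above (a sketch, not a definitive diagnosis): first, mapred.map.tasks is only a hint — the actual number of map tasks equals the number of input splits, so a splittable 500G file should yield far more than 4 maps (hadoop fsck with -files -blocks shows the block count per file). Second, mapred.tasktracker.map.tasks.maximum is read by each TaskTracker at startup from its own local mapred-site.xml; if the setting only lives on the master, the workers keep the default of 2 map slots each. A hedged fragment of what each worker node would need locally:

```xml
<!-- mapred-site.xml on EVERY worker node, not just the master:
     TaskTrackers read these slot limits locally at startup, so each
     TaskTracker must be restarted after the change takes effect. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>24</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>24</value>
</property>
```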