Re: Running terasort with 1 map task
http://wiki.apache.org/hadoop/HowManyMapsAndReduces

It is possible to have a single mapper if the input is not splittable, but
that is rarely seen as a feature.
One could ask why you want to use a platform for distributed computing for
a job that shouldn't be distributed.
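
As for why 24 maps show up: teragen's first argument is a row count, and each row is 100 bytes, so 32000000 rows is about 3.2 GB, not 32 MB. The sketch below roughly models how the old-API FileInputFormat sizes its splits, where mapred.map.tasks is only a hint. The 128 MB block size is an assumption (it is the value that happens to give 24 here), and the function names are invented for illustration, not Hadoop APIs.

```python
ROW_BYTES = 100    # teragen writes 100-byte rows
SPLIT_SLOP = 1.1   # the last split may be up to 10% larger than the others


def split_size(total_bytes, requested_maps, block_size, min_split_size=1):
    """Split size as max(minSize, min(goalSize, blockSize))."""
    goal_size = total_bytes // max(requested_maps, 1)
    return max(min_split_size, min(goal_size, block_size))


def count_splits(total_bytes, requested_maps, block_size, min_split_size=1):
    """Number of splits (and so map tasks) for one splittable file."""
    size = split_size(total_bytes, requested_maps, block_size, min_split_size)
    remaining = total_bytes
    splits = 0
    while remaining / size > SPLIT_SLOP:
        splits += 1
        remaining -= size
    if remaining > 0:
        splits += 1  # whatever is left becomes the final split
    return splits


total = 32_000_000 * ROW_BYTES   # 3,200,000,000 bytes
block = 128 * 1024 * 1024        # assumed HDFS block size

maps_with_hint_only = count_splits(total, 1, block)
maps_with_large_min_split = count_splits(total, 1, block, min_split_size=total)
```

Under these assumptions the hint alone leaves one split per block, while a minimum split size at least as large as the file gives a single split. If the model holds for your version, passing -Dmapred.min.split.size with such a value to terasort may be enough; I have not verified this on 1.0.4.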

Regards

Bertrand
On Tue, Feb 26, 2013 at 12:09 PM, Arindam Choudhury <
[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I am trying to run terasort using one map and one reduce. I generated
> the input data using:
>
> hadoop jar hadoop-examples-1.0.4.jar teragen -Dmapred.map.tasks=1
> -Dmapred.reduce.tasks=1 32000000 /user/hadoop/input32mb1map
>
> Then I launched the hadoop terasort job using:
>
> hadoop jar hadoop-examples-1.0.4.jar terasort -Dmapred.map.tasks=1
> -Dmapred.reduce.tasks=1 /user/hadoop/input32mb1map /user/hadoop/output1
>
> I expected the job to run with 1 map and 1 reduce, but when I inspected
> the job statistics I found:
>
> hadoop job -history /user/hadoop/output1
>
> Task Summary
> ============================
> Kind     Total  Successful  Failed  Killed  StartTime             FinishTime
>
> Setup    1      1           0       0       26-Feb-2013 10:57:47  26-Feb-2013 10:57:55 (8sec)
> Map      24     24          0       0       26-Feb-2013 10:57:57  26-Feb-2013 11:05:37 (7mins, 40sec)
> Reduce   1      1           0       0       26-Feb-2013 10:58:21  26-Feb-2013 11:08:31 (10mins, 10sec)
> Cleanup  1      1           0       0       26-Feb-2013 11:08:32  26-Feb-2013 11:08:36 (4sec)
> ============================
>
> So although I asked for one map task, 24 of them were launched.
>
> How can I solve this problem? How do I tell Hadoop to launch only one map?
>
> Thanks,
>