MapReduce >> mail # user >> Set reducer capacity for a specific M/R job


Thread:
Han JU 2013-04-30, 10:00
Nitin Pawar 2013-04-30, 10:26
Han JU 2013-04-30, 10:32
Nitin Pawar 2013-04-30, 10:35
Han JU 2013-04-30, 10:38
Nitin Pawar 2013-04-30, 10:45
Re: Set reducer capacity for a specific M/R job
Forgot to add: there is a similar method for reducers as well:

job.setNumReduceTasks(0);
On Tue, Apr 30, 2013 at 3:56 PM, Nitin Pawar <[EMAIL PROTECTED]>wrote:

> The mapred.tasktracker.reduce.tasks.maximum parameter sets the
> maximum number of reduce tasks that may be run by an individual TaskTracker
> server at one time. This is not a per-job configuration.
>
> The number of map tasks for a given job is driven by the number of input
> splits, not by the mapred.map.tasks parameter. For each input split a
> map task is spawned, so over the lifetime of a MapReduce job the number of
> map tasks is equal to the number of input splits. mapred.map.tasks is just
> a hint to the InputFormat about the number of maps.
>
> If you want to set the max number of maps or reducers per job, then you can
> set the hints through the job object you created:
> job.setNumMapTasks(n)  (this setter lives on the old-API JobConf; the
> new-API Job only exposes setNumReduceTasks())
>
> Note this is just a hint and again the number will be decided by the input
> split size.
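The split-driven map count described above can be illustrated with a small stand-alone sketch (the file and split sizes below are made-up numbers, not from this thread):

```java
public class SplitCount {
    // One map task is spawned per input split, so the map count for a
    // single file is its size divided by the split size, rounded up.
    static long numMapTasks(long fileSizeBytes, long splitSizeBytes) {
        return Math.max(1, (fileSizeBytes + splitSizeBytes - 1) / splitSizeBytes);
    }

    public static void main(String[] args) {
        // A 1 GiB input with a 64 MiB split size yields 16 map tasks,
        // regardless of any mapred.map.tasks hint.
        System.out.println(numMapTasks(1L << 30, 64L << 20));
    }
}
```

This is why the web UI can show the hint you set while the framework still runs a different number of tasks: the split computation, not the hint, has the final word.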
>
>
> On Tue, Apr 30, 2013 at 3:39 PM, Han JU <[EMAIL PROTECTED]> wrote:
>
>> Thanks Nitin.
>>
>> What I need is to set the slots only for a specific job, not in the whole
>> cluster conf.
>> But what I did does NOT work ... Have I done something wrong?
>>
>>
>> 2013/4/30 Nitin Pawar <[EMAIL PROTECTED]>
>>
>>> The config you are setting applies to that job only.
>>>
>>> But if you want to reduce the slots on the tasktrackers, then you will need
>>> to edit the tasktracker conf and restart the tasktracker.
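For reference, that tasktracker-side limit is set in mapred-site.xml on each worker node and only takes effect after a tasktracker restart; the value 4 below is just an illustration, not a recommendation:

```xml
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
  <description>Maximum number of reduce tasks run simultaneously by
  this TaskTracker. A daemon-side setting, not per-job.</description>
</property>
```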
>>> On Apr 30, 2013 3:30 PM, "Han JU" <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to change the cluster's capacity of reduce slots on a per-job
>>>> basis. Originally I have 8 reduce slots per tasktracker.
>>>> I did:
>>>>
>>>> conf.set("mapred.tasktracker.reduce.tasks.maximum", "4");
>>>> ...
>>>> Job job = new Job(conf, ...)
>>>>
>>>>
>>>> And in the web UI I can see that for this job the max reduce tasks is
>>>> exactly 4, as I set. However, Hadoop still launches 8 reducers per
>>>> node ... why is this?
>>>>
>>>> How could I achieve this?
>>>> --
>>>> *JU Han*
>>>>
>>>> Software Engineer Intern @ KXEN Inc.
>>>> UTC   -  Université de Technologie de Compiègne
>>>> *     **GI06 - Fouille de Données et Décisionnel*
>>>>
>>>> +33 0619608888
>>>>
>>>
>>
>>
>>
>
>
>
>

--
Nitin Pawar