-
mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Mohit Anchlia 2012-03-10, 00:42
What's the difference between mapred.tasktracker.reduce.tasks.maximum and mapred.map.tasks? I want my data to be split across only 10 mappers in the entire cluster. Can I do that using one of the above parameters?
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Chen He 2012-03-10, 01:00
Hi Mohit,
"mapred.tasktracker.reduce.tasks.maximum" (or "mapred.tasktracker.map.tasks.maximum") sets how many reduce (or map) slots you can have on each tasktracker.
"mapred.reduce.tasks" (or "mapred.map.tasks") sets the default number of reduce (or map) tasks your job will have.
To set the number of mappers in your application, you can call this on your JobConf:
configuration.setNumMapTasks(the number you want);
Chen
Actually, you can also just use configuration.set() with the property name.
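To make the per-tasktracker versus per-job distinction concrete, here is a minimal sketch of the slot arithmetic in plain Java. It does not use any Hadoop API; the class name, method names, and cluster numbers (5 tasktrackers, 2 map slots each) are hypothetical and only illustrate that the tasktracker maximum caps how many map tasks run at once, not how many map tasks a job has.

```java
// Sketch only: models slot arithmetic, not Hadoop's scheduler.
final class MapSlotCapacity {

    // Concurrent map capacity of the cluster:
    // (number of tasktrackers) * (mapred.tasktracker.map.tasks.maximum).
    static int clusterMapSlots(int taskTrackers, int mapSlotsPerTracker) {
        return taskTrackers * mapSlotsPerTracker;
    }

    // Number of "waves" needed to run all map tasks of one job,
    // assuming the job has the whole cluster to itself.
    static int waves(int mapTasks, int clusterSlots) {
        return (mapTasks + clusterSlots - 1) / clusterSlots; // ceiling division
    }

    public static void main(String[] args) {
        // Hypothetical cluster: 5 tasktrackers with 2 map slots each.
        int slots = clusterMapSlots(5, 2);
        System.out.println("cluster map slots = " + slots);
        // A job with 25 map tasks still runs all 25; they just queue for slots.
        System.out.println("waves for 25 map tasks = " + waves(25, slots));
    }
}
```

Under these assumptions, lowering the per-tasktracker maximum limits concurrency for every job on the cluster, which is a different thing from asking one job to use 10 mappers.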
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Mohit Anchlia 2012-03-10, 01:19
What's the difference between setNumMapTasks and mapred.map.tasks?
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Chen He 2012-03-10, 02:19
If you do not call setNumMapTasks, the system will by default use the value configured for "mapred.map.tasks" in the conf/mapred-site.xml file.
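The fallback described above can be sketched with plain java.util.Properties standing in for Hadoop's configuration layering. This is a model, not Hadoop's Configuration class: the class name, the method, and the fallback value "1" are assumptions made for illustration. The point is that a job-level setting and the site file write to the same key, and the job-level value wins when present.

```java
import java.util.Properties;

// Sketch only: java.util.Properties models "site file value unless the job overrides it".
final class MapTasksPrecedence {

    static final String KEY = "mapred.map.tasks";

    // siteDefaults plays the role of conf/mapred-site.xml;
    // jobOverride plays the role of a per-job setting (null means "not set").
    static int effectiveMapTasks(Properties siteDefaults, Integer jobOverride) {
        Properties job = new Properties(siteDefaults); // lookups fall back to siteDefaults
        if (jobOverride != null) {
            job.setProperty(KEY, Integer.toString(jobOverride));
        }
        // "1" is a placeholder fallback for this sketch, not a documented Hadoop default.
        return Integer.parseInt(job.getProperty(KEY, "1"));
    }

    public static void main(String[] args) {
        Properties site = new Properties();
        site.setProperty(KEY, "4"); // hypothetical site-wide value
        System.out.println("no job override: " + effectiveMapTasks(site, null));
        System.out.println("job override 10: " + effectiveMapTasks(site, 10));
    }
}
```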
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Mohit Anchlia 2012-03-10, 04:34
Is this a system-level parameter too? Or can I specify it as mapred.map.tasks? I am using Pig.
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
bejoy.hadoop@... 2012-03-10, 06:35
Mohit,
It is a job-level config parameter. For plain MapReduce jobs you can set it through the CLI as
hadoop jar ... -D mapred.map.tasks=n
You should be able to do it in Pig as well.
However, the number of map tasks for a job is governed by the input splits and the InputFormat you are using, so setting this config parameter doesn't guarantee that your job will have the specified number of map tasks. Normally you set the number of reduce tasks this way for your job: mapred.reduce.tasks=n
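A rough sketch of why the requested map count is only a hint: the snippet below models the split-size rule as it is commonly described for the old-API FileInputFormat, namely split size = max(minimum split size, min(total size / requested maps, block size)). Treat that rule as an assumption rather than a statement about every Hadoop version or InputFormat. The sketch handles a single file, ignores the extra slack Hadoop allows on the last split, and uses hypothetical sizes (a 1 GB file, 64 MB blocks).

```java
// Sketch only: a simplified model of file-based split sizing, not Hadoop code.
final class SplitEstimate {

    // Assumed rule: never smaller than minSize, otherwise the smaller of goal and block size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Estimated number of splits (and so map tasks) for one file.
    static long estimateSplits(long fileSize, int requestedMaps, long minSize, long blockSize) {
        long goalSize = fileSize / Math.max(1, requestedMaps);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        return (fileSize + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long fileSize = 1L << 30;   // hypothetical 1 GB input file
        long blockSize = 64L << 20; // hypothetical 64 MB block size

        // Requesting fewer maps than there are blocks: the block size still bounds the split.
        System.out.println("requested 10 -> " + estimateSplits(fileSize, 10, 1, blockSize));
        // Requesting more maps than there are blocks: the goal size shrinks the split.
        System.out.println("requested 32 -> " + estimateSplits(fileSize, 32, 1, blockSize));
        // Raising the minimum split size (128 MB here) is what reduces the count in this model.
        System.out.println("requested 10, min 128 MB -> "
                + estimateSplits(fileSize, 10, 128L << 20, blockSize));
    }
}
```

In this model a request for 10 maps over 16 blocks still yields 16 splits, so getting fewer mappers means making the splits larger rather than lowering the requested count.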
Hope it helps.
Regards,
Bejoy K S
From handheld, Please excuse typos.