-
mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Mohit Anchlia 2012-03-10, 00:42
What's the difference between mapred.tasktracker.reduce.tasks.maximum and mapred.map.tasks?

I want my data to be split across only 10 mappers in the entire cluster. Can I do that using one of the above parameters?
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Chen He 2012-03-10, 01:00
Hi Mohit

"mapred.tasktracker.reduce(map).tasks.maximum" sets how many reduce (map) slots you can have on each tasktracker.

"mapred.reduce(map).tasks" is the default number of reduce (map) tasks your job will have.

To set the number of mappers in your application, you can write:

*configuration.setNumMapTasks(the number you want);*

Chen

Actually, you can just use configuration.set()
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Mohit Anchlia 2012-03-10, 01:19
What's the difference between setNumMapTasks and mapred.map.tasks?
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Chen He 2012-03-10, 02:19
If you do not call setNumMapTasks, the system by default uses the number you configured for "mapred.map.tasks" in the conf/mapred-site.xml file.
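To make the distinction between the two kinds of settings in this thread concrete, here is a minimal sketch in plain Java (no Hadoop dependency). It assumes a hypothetical cluster of 5 tasktrackers with 2 map slots each, and it ignores other jobs and scheduler behavior. The class and method names are made up for illustration. The per-tasktracker maximum limits how many map tasks run at the same time; it does not change how many map tasks a job has.

```java
public class SlotCapacity {
    // Cluster-wide number of map tasks that can run concurrently:
    // tasktrackers * mapred.tasktracker.map.tasks.maximum
    public static int concurrentMapCapacity(int taskTrackers, int mapSlotsPerTracker) {
        return taskTrackers * mapSlotsPerTracker;
    }

    // Number of scheduling "waves" needed to run all map tasks of one job,
    // assuming the job has the whole cluster to itself (ceiling division).
    public static int waves(int jobMapTasks, int capacity) {
        return (jobMapTasks + capacity - 1) / capacity;
    }

    public static void main(String[] args) {
        int capacity = concurrentMapCapacity(5, 2); // hypothetical cluster
        System.out.println(capacity);               // prints 10
        System.out.println(waves(160, capacity));   // prints 16
        System.out.println(waves(10, capacity));    // prints 1
    }
}
```

So a job with 160 map tasks still has 160 map tasks on this cluster; the slot limit only means they run 10 at a time.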
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
Mohit Anchlia 2012-03-10, 04:34
Is this a system parameter too? Or can I specify it as mapred.map.tasks? I am using Pig.
-
Re: mapred.map.tasks vs mapred.tasktracker.map.tasks.maximum
bejoy.hadoop@... 2012-03-10, 06:35
Mohit

It is a job-level config parameter. For plain MapReduce jobs you can set it through the CLI as

    hadoop jar ... -D mapred.map.tasks=n

You should be able to do it in Pig as well.

However, the number of map tasks for a job is governed by the input splits and the InputFormat you are using. So setting this config parameter doesn't guarantee that your job will have the specified number of map tasks.

Normally you set the number of reduce tasks this way for your job: mapred.reduce.tasks=n

Hope it helps

Regards
Bejoy K S
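To illustrate why mapred.map.tasks is only a hint, here is a minimal sketch in plain Java (no Hadoop dependency) of the split-size rule used by the old-API FileInputFormat: the requested map count yields a goal size, which is then clamped by the block size and the minimum split size. The class name, the helper estimateSplits, and the 10 GB file with 64 MB blocks are assumptions for illustration. The count is approximate: it covers a single splittable file and ignores the slack Hadoop allows on the last split.

```java
public class SplitEstimate {
    // Split-size rule: max(minSize, min(goalSize, blockSize))
    public static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Approximate number of splits (and so map tasks) for one splittable file.
    // requestedMaps plays the role of mapred.map.tasks.
    public static long estimateSplits(long fileSize, int requestedMaps,
                                      long minSize, long blockSize) {
        long goalSize = fileSize / Math.max(requestedMaps, 1);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        return (fileSize + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 10L * 1024L * mb; // hypothetical 10 GB input file
        long blockSize = 64L * mb;        // hypothetical 64 MB block size

        // Asking for 10 maps: the goal size (1 GB) is capped at the block size.
        System.out.println(estimateSplits(fileSize, 10, 1L, blockSize));         // prints 160

        // Raising the minimum split size to 1 GB brings the count down to 10.
        System.out.println(estimateSplits(fileSize, 10, 1024L * mb, blockSize)); // prints 10
    }
}
```

Under these assumptions, getting about 10 mappers means raising the minimum split size rather than lowering mapred.map.tasks. In Hadoop 1.x that minimum is commonly set through mapred.min.split.size; treat that property name as something to verify for your version and InputFormat.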