Re: Dynamically set mapred.tasktracker.map.tasks.maximum from inside a job.
Hi Pierre,

The "setNumReduceTasks" method sets the number of reduce tasks to launch;
it is equivalent to setting the "mapred.reduce.tasks" parameter. The
"mapred.tasktracker.reduce.tasks.maximum" parameter, by contrast, sets the
number of reduce tasks that can run *concurrently* on one node.
And as Amareshwari mentioned, "mapred.tasktracker.map.tasks.maximum" and
"mapred.tasktracker.reduce.tasks.maximum" are cluster-level settings that
cannot be set per job. If you set
mapred.tasktracker.map.tasks.maximum to 20, and the total number of map
tasks is larger than 20 * <number of nodes>, up to 20 map tasks will run
concurrently on each node. As far as I know, you probably need to restart
the tasktracker if you really need to change this setting.
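
To make the two points above concrete, here is a minimal sketch in plain Java. It is not Hadoop code: TaskSlotSketch, TrackerModel and maxConcurrentMaps are hypothetical names, and java.util.Properties only stands in for a Hadoop Configuration. It assumes, as described in this thread, that the tasktracker reads its slot count once at startup.

```java
import java.util.Properties;

/** Hypothetical model, not Hadoop code: per-node map slots as described above. */
final class TaskSlotSketch {

    static final String MAP_SLOTS_KEY = "mapred.tasktracker.map.tasks.maximum";

    /** Stand-in for a tasktracker that reads its slot count once, at startup. */
    static final class TrackerModel {
        private final int mapSlots;

        TrackerModel(Properties clusterConf) {
            // The value is copied here; later changes to any configuration
            // object are never seen by this already-started tracker.
            this.mapSlots = Integer.parseInt(clusterConf.getProperty(MAP_SLOTS_KEY));
        }

        int mapSlots() {
            return mapSlots;
        }
    }

    /** Upper bound on the number of map tasks running at once across the cluster. */
    static int maxConcurrentMaps(int totalMapTasks, int nodes, int slotsPerNode) {
        return Math.min(totalMapTasks, nodes * slotsPerNode);
    }

    public static void main(String[] args) {
        // Cluster-side configuration, read when the tracker starts.
        Properties clusterConf = new Properties();
        clusterConf.setProperty(MAP_SLOTS_KEY, "12");
        TrackerModel tracker = new TrackerModel(clusterConf);

        // Job-side configuration: a separate object, created after startup.
        Properties jobConf = new Properties();
        jobConf.setProperty(MAP_SLOTS_KEY, "20");

        // The tracker still uses the value it loaded at startup.
        System.out.println(tracker.mapSlots()); // prints 12

        // 1000 map tasks on 5 nodes with 12 slots each: 60 run at once.
        System.out.println(maxConcurrentMaps(1000, 5, tracker.mapSlots())); // prints 60
    }
}
```

In this model the only way to get 20 slots per node is to build a new TrackerModel from an updated cluster configuration, which corresponds to restarting the tasktracker.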

Best Regards,
Carp

2010/6/30 Pierre ANCELOT <[EMAIL PROTECTED]>

> Sure, but not the number of tasks running concurrently on a node at the
> same time.
>
>
>
> On Wed, Jun 30, 2010 at 1:57 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > The number of map tasks is determined by InputSplit.
> >
> > On Wednesday, June 30, 2010, Pierre ANCELOT <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > > Okay, so, if I set the 20 by default, I could maybe limit the number of
> > > concurrent maps per node instead?
> > > job.setNumReduceTasks exists but I see no equivalent for maps, though I
> > > think there was a setNumMapTasks before...
> > > Was it removed? Why?
> > > Any idea about how to achieve this?
> > >
> > > Thank you.
> > >
> > >
> > > On Wed, Jun 30, 2010 at 12:08 PM, Amareshwari Sri Ramadasu <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > >> Hi Pierre,
> > >>
> > >> "mapred.tasktracker.map.tasks.maximum" is a cluster-level configuration;
> > >> it cannot be set per job. It is loaded only while bringing up the
> > >> TaskTracker.
> > >>
> > >> Thanks
> > >> Amareshwari
> > >>
> > >> On 6/30/10 3:05 PM, "Pierre ANCELOT" <[EMAIL PROTECTED]> wrote:
> > >>
> > >> Hi everyone :)
> > >> There's something I'm probably doing wrong but I can't seem to figure out
> > >> what.
> > >> I have two hadoop programs running one after the other.
> > >> This is done because they don't have the same needs in terms of processor
> > >> and memory, so by separating them I optimize each task better.
> > >> Fact is, for the first job I need
> > >> mapred.tasktracker.map.tasks.maximum set to 12 on every node.
> > >> For the second job, I need it to be set to 20.
> > >> So by default I set it to 12 and in the second job's code, I set this:
> > >>
> > >>        Configuration hadoopConfiguration = new Configuration();
> > >>        hadoopConfiguration.setInt("mapred.tasktracker.map.tasks.maximum", 20);
> > >>
> > >> But when running the job, instead of having the 20 tasks on each node as
> > >> expected, I have 12....
> > >> Any idea please?
> > >>
> > >> Thank you.
> > >> Pierre.
> > >>
> > >>
> > >> --
> > >> http://www.neko-consulting.com
> > >> Ego sum quis ego servo
> > >> "Je suis ce que je protège"
> > >> "I am what I protect"
> > >>
> > >>