

Re: Help: How to increase the number of map tasks per job?
Check out mapred.map.tasks and mapred.reduce.tasks; these are set per job rather than per TaskTracker.
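
For anyone wondering why raising mapred.map.tasks does not always change the map count: in the old org.apache.hadoop.mapred API the value is only a hint that feeds into the split-size calculation of file-based input formats. Below is a rough, self-contained Java sketch of that calculation as I understand it. It does not call Hadoop itself. The class name SplitEstimate, the method estimateSplits, and the example sizes are made up for illustration, and it assumes every input file is splittable.

```java
// Hypothetical, stdlib-only sketch; not Hadoop code. It mimics the split
// arithmetic of the old-API FileInputFormat under simplifying assumptions.
class SplitEstimate {
    // The last split of a file may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    // splitSize = max(minSize, min(goalSize, blockSize))
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // fileSizes: input file lengths in bytes (assumed splittable)
    // blockSize: filesystem block size in bytes
    // minSize: stands in for mapred.min.split.size
    // requestedMaps: stands in for mapred.map.tasks (a hint only)
    static int estimateSplits(long[] fileSizes, long blockSize,
                              long minSize, int requestedMaps) {
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        long goalSize = total / Math.max(1, requestedMaps);
        // Clamp to 1 byte so the loop below always terminates.
        long splitSize = computeSplitSize(goalSize, Math.max(1L, minSize), blockSize);

        int splits = 0;
        for (long size : fileSizes) {
            long remaining = size;
            while ((double) remaining / splitSize > SPLIT_SLOP) {
                splits++;
                remaining -= splitSize;
            }
            if (remaining > 0) {
                splits++; // tail of the file; zero-length files are ignored here
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] files = {64 * mb, 64 * mb, 64 * mb, 64 * mb};
        System.out.println("hint=1:  " + estimateSplits(files, 64 * mb, 1, 1));
        System.out.println("hint=82: " + estimateSplits(files, 64 * mb, 1, 82));
    }
}
```

In this sketch, four block-sized files give four splits when the hint is low, because the block size caps the split size and each file is split on its own. A larger hint shrinks the goal size and yields more splits. This is one possible scenario for a stage with few map tasks, not a diagnosis of the job described in this thread.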

On Fri, Jan 7, 2011 at 1:40 PM, Tali K <[EMAIL PROTECTED]> wrote:

>
> According to the documentation, that parameter sets the number of
> tasks *per TaskTracker*. I am asking about the number of tasks
> for the entire job across the entire cluster. That parameter is already
> set to 3, which is one less than the number of cores on each node's
> CPU, as recommended. In my question I stated that
> 82 tasks were run for the first job, yet only 4 for the second,
> both numbers being cluster-wide.
>
>
>
> > Date: Fri, 7 Jan 2011 13:19:42 -0800
> > Subject: Re: Help: How to increase the number of map tasks per job?
> > From: [EMAIL PROTECTED]
> > To: [EMAIL PROTECTED]
> >
> > Set higher values for mapred.tasktracker.map.tasks.maximum (and
> > mapred.tasktracker.reduce.tasks.maximum) in mapred-site.xml
> >
> > On Fri, Jan 7, 2011 at 12:58 PM, Tali K <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > We have a job that runs in several map/reduce stages. In the first
> > > job, a large number of map tasks (82) are initiated, as expected,
> > > and that causes all nodes to be used.
> > > In a later job, where we are still dealing with large amounts of
> > > data, only 4 map tasks are initiated, so only 4 nodes are used.
> > > This stage is actually the workhorse of the job, and requires much
> > > more processing power than the initial stage.
> > > We are trying to understand why only a few map tasks are
> > > being used, as we are not getting the full advantage of our cluster.
> > >
>
>