Re: Help: How to increase the number of map tasks per job?
Also make sure you have enough input files for the next stage's mappers to
work with.

Read through the input splits part of the tutorial:
http://wiki.apache.org/hadoop/HadoopMapReduce

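If the last stage had only 4 reducers running, they'd generate 4 output
files. If each of those files fits in a single input split (or cannot be
split, e.g. because it is gzip-compressed), this will limit the # of mappers
started in the next stage to 4, unless you tune your input split parameters
or write a custom input format that produces more splits.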
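
To make the arithmetic concrete, here is a rough sketch in plain Java (no
Hadoop dependency). It assumes the split-size rule used by the newer
FileInputFormat, max(minSize, min(maxSize, blockSize)), and it assumes the
files are splittable. The class name, the file sizes and the 64 MB block size
are made up for illustration, and the real code also allows some slop on the
last split, so treat the counts as estimates only. The property names that set
the min/max split size differ between Hadoop versions, so check the docs for
yours.

```java
public class SplitEstimate {
    // Split-size rule assumed here: max(minSize, min(maxSize, blockSize)).
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Simplified estimate: each file is split on its own, and the last
    // partial chunk of a file counts as one more split (ceiling division).
    static long estimateSplits(long[] fileSizes, long splitSize) {
        long splits = 0;
        for (long size : fileSizes) {
            if (size > 0) {
                splits += (size + splitSize - 1) / splitSize;
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Hypothetical: 4 reducers each wrote one 48 MB output file.
        long[] reducerOutputs = {48 * mb, 48 * mb, 48 * mb, 48 * mb};
        long blockSize = 64 * mb; // assumed HDFS block size

        // No max split size set: the split size equals the block size.
        long defaultSplit = computeSplitSize(blockSize, 1, Long.MAX_VALUE);
        // Max split size lowered to 8 MB.
        long tunedSplit = computeSplitSize(blockSize, 1, 8 * mb);

        System.out.println(estimateSplits(reducerOutputs, defaultSplit)); // prints 4
        System.out.println(estimateSplits(reducerOutputs, tunedSplit));   // prints 24
    }
}
```

With these made-up numbers, each file is smaller than a block, so you get one
mapper per file until the max split size is lowered.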

Hope this helps. There is a lot more literature on this on the web and in the
Hadoop books released to date.

-Rahul
On Fri, Jan 7, 2011 at 1:19 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Set higher values for mapred.tasktracker.map.tasks.maximum (and
> mapred.tasktracker.reduce.tasks.maximum) in mapred-site.xml
>
> On Fri, Jan 7, 2011 at 12:58 PM, Tali K <[EMAIL PROTECTED]> wrote:
>
> >
> > We have a job which runs in several map/reduce stages. In the first
> > job, a large number of map tasks (82) are initiated, as expected, and
> > that causes all nodes to be used. In a later job, where we are still
> > dealing with large amounts of data, only 4 map tasks are initiated,
> > so only 4 nodes are used. This stage is actually the workhorse of the
> > job, and requires much more processing power than the initial stage.
> > We are trying to understand why only a few map tasks are being used,
> > as we are not getting the full advantage of our cluster.
> >
>