Tali K 2011-01-07, 20:58
Ted Yu 2011-01-07, 21:19
Re: Help: How to increase amount of map tasks per job?
Also make sure you have enough input files for the next stage's mappers to
work with.

Read through the input splits part of the tutorial:
http://wiki.apache.org/hadoop/HadoopMapReduce

If the last stage had only 4 reducers running, they'd generate 4 output
files. This will limit the # of mappers started in the next stage to 4,
unless you tune your input split parameters or write a custom input split.
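
To make the effect of the split settings concrete, here is a rough sketch in
Python (not Hadoop code) of how the number of map tasks follows from the
input file sizes and the split size. It assumes splittable, non-empty input
files and mirrors the rule used by the file-based input formats as I
understand it: split size = max(min size, min(max size, block size)), with a
10% slop on the last split. The function and parameter names are made up for
illustration; check your Hadoop version for the actual property names.

```python
MIB = 1024 * 1024


def split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size rule: max(min_size, min(max_size, block_size))."""
    return max(min_size, min(max_size, block_size))


def estimate_map_tasks(file_sizes, block_size, min_size=1,
                       max_size=2**63 - 1, slop=1.1):
    """Estimate the number of input splits (one map task per split).

    Assumption: every file is splittable and non-empty. Each file is
    split on its own, so a split never spans two files.
    """
    size = split_size(block_size, min_size, max_size)
    total = 0
    for remaining in file_sizes:
        # Carve off full-size splits while more than `slop` splits remain.
        while remaining / size > slop:
            total += 1
            remaining -= size
        # Whatever is left becomes the final split of this file.
        if remaining > 0:
            total += 1
    return total


# Four small reducer output files (10 MiB each) with a 64 MiB block size:
# one split per file, so only 4 mappers.
small_outputs = [10 * MIB] * 4
default_maps = estimate_map_tasks(small_outputs, block_size=64 * MIB)

# Same files with the maximum split size capped at 1 MiB: 10 splits per file.
tuned_maps = estimate_map_tasks(small_outputs, block_size=64 * MIB,
                                max_size=1 * MIB)
```

With the defaults this gives 4 map tasks for the four files; capping the
split size at 1 MiB gives 40. Running more reducers in the previous stage
raises the file count and has a similar effect.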

Hope this helps. There is a lot more literature on this on the web and in
the Hadoop books released to date.

-Rahul
On Fri, Jan 7, 2011 at 1:19 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Set higher values for mapred.tasktracker.map.tasks.maximum (and
> mapred.tasktracker.reduce.tasks.maximum) in mapred-site.xml
>
> On Fri, Jan 7, 2011 at 12:58 PM, Tali K <[EMAIL PROTECTED]> wrote:
>
> >
> > We have a job which runs in several map/reduce stages. In the first
> > job, a large number of map tasks (82) are initiated, as expected,
> > and that causes all nodes to be used.
> > In a later job, where we are still dealing with large amounts of
> > data, only 4 map tasks are initiated, so only 4 nodes are used.
> > This stage is actually the workhorse of the job, and requires much
> > more processing power than the initial stage.
> > We are trying to understand why only a few map tasks are being used,
> > as we are not getting the full advantage of our cluster.
> >
>
Tali K 2011-01-07, 21:40
Niels Basjes 2011-01-07, 21:44
Ted Yu 2011-01-07, 21:47
Harsh J 2011-01-08, 04:12