Re: Obtaining the number of map slots through the API (Hadoop 0.20.2)
Hemanth Yamijala 2010-09-05, 09:42
> The optimization of one Hadoop job I'm running would benefit from knowing
> the maximum number of map slots in the Hadoop cluster.
> This number can be obtained (if my understanding is correct) by:
> * parsing the mapred-site.xml file to get
> the mapred.tasktracker.map.tasks.maximum value (assuming it is set of
> * parsing the slaves file to get the maximum number of compute nodes in the
> * multiplying the 2 values
> My question is:
> I would like to learn about *all* possible ways to get this information
> through API calls (either the Hadoop Common API or the Hadoop MapReduce
> API), i.e. obtaining it through a Job object, through a Configuration
The easiest way I can think of is using
org.apache.hadoop.mapred.ClusterStatus.getMaxMapTasks(). You can get an
instance of ClusterStatus by calling JobClient.getClusterStatus().
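
A minimal sketch of that call sequence against the old
org.apache.hadoop.mapred API in 0.20.2 follows. The class name MapSlotCount
is made up for illustration, and the sketch assumes the cluster's
mapred-site.xml (with mapred.job.tracker set) is on the client's classpath
so that JobConf can locate the JobTracker.

```java
import java.io.IOException;

import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical class name, used only for this example.
public class MapSlotCount {
    public static void main(String[] args) throws IOException {
        // Assumption: mapred-site.xml is on the classpath, so the default
        // JobConf already points at the cluster's JobTracker.
        JobConf conf = new JobConf();
        JobClient client = new JobClient(conf);
        try {
            // Ask the JobTracker for its current view of the cluster.
            ClusterStatus status = client.getClusterStatus();
            System.out.println("TaskTrackers:     " + status.getTaskTrackers());
            System.out.println("Max map slots:    " + status.getMaxMapTasks());
            System.out.println("Max reduce slots: " + status.getMaxReduceTasks());
        } finally {
            client.close();
        }
    }
}
```

As far as I know, the figure comes from the TaskTrackers the JobTracker
currently knows about, so it can be lower than the slaves-file estimate
when some nodes are down.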