On Wed, Aug 8, 2012 at 10:52 PM, Nagaraju Bingi
<[EMAIL PROTECTED]> wrote:
> I'm a beginner in Hadoop concepts. I have a few basic questions:
> 1) I'm looking for APIs to retrieve the capacity of the cluster, so that I can write a script that decides when to add a new slave node to the cluster.
> a) Number of TaskTrackers, and the maximum number of Mappers each TaskTracker can spawn
For this, see: http://hadoop.apache.org/common/docs/stable/api/org/apache/hadoop/mapred/ClusterStatus.html
> b) CPU, RAM, and disk capacity of each tracker
Rely on external monitoring tools for this one. Ganglia and Nagios,
for instance, can report these per-node figures.
> c) How to decide when to add a new slave node to the cluster
This depends heavily on the workload you expect your cluster to handle.
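As a rough illustration only, one possible rule is to compare running tasks
against the slot capacity reported under 1.a and flag the cluster once usage
crosses a threshold. The sketch below is plain Java with no Hadoop dependency.
The class name, method names, sample numbers, and the 0.85 threshold are all
hypothetical. In a real script the four counts would presumably come from
ClusterStatus (getMapTasks(), getMaxMapTasks(), getReduceTasks(),
getMaxReduceTasks()).

```java
public class ScaleOutCheck {

    // Fraction of slots currently in use; 0.0 when no capacity is reported.
    static double utilization(int running, int max) {
        if (max <= 0) {
            return 0.0;
        }
        return (double) running / max;
    }

    // Hypothetical rule: suggest a new node when either map or reduce
    // slot usage reaches the given threshold (a value between 0 and 1).
    static boolean shouldAddNode(int runningMaps, int maxMaps,
                                 int runningReduces, int maxReduces,
                                 double threshold) {
        return utilization(runningMaps, maxMaps) >= threshold
                || utilization(runningReduces, maxReduces) >= threshold;
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for the ClusterStatus getters.
        int runningMaps = 36;
        int maxMaps = 40;
        int runningReduces = 5;
        int maxReduces = 20;

        System.out.println("map slot utilization: "
                + utilization(runningMaps, maxMaps));
        System.out.println("reduce slot utilization: "
                + utilization(runningReduces, maxReduces));
        System.out.println("suggest adding a node: "
                + shouldAddNode(runningMaps, maxMaps,
                                runningReduces, maxReduces, 0.85));
    }
}
```

A single reading like this is noisy, so a real script would likely sample over
time and combine it with the per-node figures from 1.b before acting.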
> 2) What is the API to retrieve metrics such as current resource usage and the currently running/spawned Mappers/Reducers?
See 1.a for the currently running map and reduce task counts, and 1.b
for resource usage.
> 3) What is the purpose of Hadoop Common? Is it the API to interact with Hadoop?
Hadoop Common contains the utilities shared by the other two
sub-projects, MapReduce and HDFS. Among other things, it provides
a general interaction API for all things 'Hadoop'.