
Hadoop >> mail # dev >> Hadoop cluster/monitoring

Re: Hadoop cluster/monitoring

On Wed, Aug 8, 2012 at 10:52 PM, Nagaraju Bingi wrote:
> Hi,
> I'm a beginner in Hadoop concepts. I have a few basic questions:
> 1) Looking for APIs to retrieve the capacity of the cluster, so that I can write a script to decide when to add a new slave node to the cluster:
>              a) Number of TaskTrackers, and the capacity of each TaskTracker (the maximum number of Mappers it can spawn)

For this, see: http://hadoop.apache.org/common/docs/stable/api/org/apache/hadoop/mapred/ClusterStatus.html
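As a rough sketch of using that class (this assumes the Hadoop 1.x `mapred` API and a cluster configuration on the classpath, so it only runs against a live cluster):

```java
import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Hedged sketch: query cluster capacity via the old mapred API (Hadoop 1.x).
// Assumes the JobTracker address is available from the default configuration.
public class ClusterCapacity {
    public static void main(String[] args) throws Exception {
        JobClient client = new JobClient(new JobConf());
        ClusterStatus status = client.getClusterStatus();
        System.out.println("Live TaskTrackers: " + status.getTaskTrackers());
        System.out.println("Max map slots:     " + status.getMaxMapTasks());
        System.out.println("Max reduce slots:  " + status.getMaxReduceTasks());
    }
}
```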

>               b) CPU, RAM, and disk capacity of each tracker

Rely on other tools to provide this one; Ganglia and Nagios, for
instance, can report these metrics.
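That said, a few of these numbers can also be read from standard Java APIs on each node. A minimal plain-Java sketch (the disk path `/` is an assumption, and the memory figure is the JVM's view rather than the machine total, which is why dedicated monitoring tools remain the better source):

```java
import java.io.File;

// Minimal sketch: per-node CPU and disk figures from standard Java APIs.
// The RAM value is the JVM's max heap, not the machine's physical memory.
public class NodeResources {
    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        long diskGb = new File("/").getTotalSpace() / (1024L * 1024 * 1024);
        long jvmMemMb = Runtime.getRuntime().maxMemory() / (1024L * 1024);
        System.out.println("CPU cores:        " + cpus);
        System.out.println("Disk capacity GB: " + diskGb);
        System.out.println("JVM max heap MB:  " + jvmMemMb);
    }
}
```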

>               c) how to decide when to add a new slave node to the cluster

This is highly dependent on the workload you require of your cluster.
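One common, if crude, starting point is to watch slot occupancy and flag the cluster when it stays near capacity. A sketch, not a rule; the 0.9 threshold is an arbitrary assumption, and this again requires a live Hadoop 1.x cluster:

```java
import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Hedged sketch: flag the cluster as "full" when map-slot occupancy
// exceeds a chosen threshold. Real policies should also consider queue
// wait times, HDFS usage, and sustained (not momentary) load.
public class CapacityCheck {
    static final double THRESHOLD = 0.9; // assumption; tune per workload

    public static void main(String[] args) throws Exception {
        ClusterStatus status = new JobClient(new JobConf()).getClusterStatus();
        double occupancy = (double) status.getMapTasks() / status.getMaxMapTasks();
        if (occupancy > THRESHOLD) {
            System.out.println("Map slots " + (int) (occupancy * 100)
                    + "% occupied - consider adding a slave node");
        }
    }
}
```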

>  2) What is the API to retrieve metrics such as current resource usage and the number of currently running/spawned Mappers/Reducers?

See 1.a. for some, and 1.b for some more.
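For the currently running task counts specifically, the same ClusterStatus object exposes them directly (again a Hadoop 1.x sketch that needs a live cluster):

```java
import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Hedged sketch: current usage figures from ClusterStatus (Hadoop 1.x).
public class ClusterUsage {
    public static void main(String[] args) throws Exception {
        ClusterStatus status = new JobClient(new JobConf()).getClusterStatus();
        System.out.println("Running map tasks:    " + status.getMapTasks());
        System.out.println("Running reduce tasks: " + status.getReduceTasks());
    }
}
```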

>  3) What is the purpose of Hadoop Common? Is it an API to interact with Hadoop?

Hadoop Common encapsulates the utilities shared by the other two
sub-projects, MapReduce and HDFS. Among other things, it provides
a general interaction API for all things 'Hadoop'.
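For example, the FileSystem abstraction in Hadoop Common is the usual programmatic entry point to HDFS. A sketch assuming the default filesystem is configured in the loaded Configuration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: listing a directory through Hadoop Common's FileSystem API.
public class ListRoot {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        for (FileStatus f : fs.listStatus(new Path("/"))) {
            System.out.println(f.getPath());
        }
    }
}
```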

Harsh J