Hadoop, mail # user - check namenode, jobtracker, datanodes and tasktracker status

Marc Sturlese 2011-07-08, 17:49
Hey there,
I've written some scripts to check DFS disk space, the number of datanodes,
the number of tasktrackers, heap in use...
I'm on Hadoop 0.20.2, and to do that I use DFSClient and JobClient.
I do things like:

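// socketJT and socketNN are the JobTracker and NameNode addresses (InetSocketAddress)
JobClient jc = new JobClient(socketJT, conf);
// detailed = true also returns the tracker names, not only the counts
ClusterStatus clusterStatus = jc.getClusterStatus(true);
DFSClient client = new DFSClient(socketNN, conf);
// report on every datanode, live or dead
DatanodeInfo[] dni = client.datanodeReport(DatanodeReportType.ALL);

FileSystem fs = FileSystem.get(new URI("hdfs://" + host + "/"), conf);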
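
To give an idea of what the scripts do with those values, here is a minimal sketch of the threshold check, kept free of Hadoop classes so it runs on its own. The class name, method names, thresholds and sample numbers are all made up for illustration; in the real script the inputs would come from the objects above (clusterStatus, dni, fs).

```java
import java.util.ArrayList;
import java.util.List;

public class ClusterHealthCheck {

    // Percentage of DFS capacity in use; returns 0 if capacity is unknown.
    public static double percentUsed(long capacityBytes, long remainingBytes) {
        if (capacityBytes <= 0) {
            return 0.0;
        }
        return 100.0 * (capacityBytes - remainingBytes) / capacityBytes;
    }

    // Returns one warning per violated threshold; an empty list means healthy.
    public static List<String> warnings(long capacityBytes, long remainingBytes,
                                        int liveDatanodes, int minDatanodes,
                                        int taskTrackers, int minTaskTrackers,
                                        double maxPercentUsed) {
        List<String> out = new ArrayList<String>();
        double used = percentUsed(capacityBytes, remainingBytes);
        if (used > maxPercentUsed) {
            out.add("DFS usage at " + used + "% (limit " + maxPercentUsed + "%)");
        }
        if (liveDatanodes < minDatanodes) {
            out.add("only " + liveDatanodes + " of " + minDatanodes + " datanodes");
        }
        if (taskTrackers < minTaskTrackers) {
            out.add("only " + taskTrackers + " of " + minTaskTrackers + " tasktrackers");
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not taken from a real cluster.
        for (String w : warnings(1000L, 250L, 3, 4, 4, 4, 80.0)) {
            System.out.println("WARN: " + w);
        }
    }
}
```

Keeping the evaluation separate from the calls to the namenode and jobtracker means the thresholds can be tested without touching the cluster.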

It's working well, but I'm worried that running the script continuously could
be harmful for the cluster (resource consumption). Is it all right, for example, to
run it every 10 or 15 minutes? If not, what is a good practice for
monitoring the cluster?
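
To make the polling idea concrete, here is a minimal sketch of running a check at a fixed interval from one long-lived process, using only java.util.concurrent. The names PeriodicCheck, schedule and countRuns are hypothetical, and the check itself is a placeholder Runnable; this only illustrates the scheduling, it says nothing about how much load the real calls put on the namenode or jobtracker.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicCheck {

    // Runs `check` immediately, then again `period` after each run finishes.
    public static ScheduledExecutorService schedule(final Runnable check,
                                                    long period, TimeUnit unit) {
        ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor();
        // An uncaught exception would cancel all later runs, so log it instead.
        Runnable safe = new Runnable() {
            public void run() {
                try {
                    check.run();
                } catch (RuntimeException e) {
                    System.err.println("check failed: " + e);
                }
            }
        };
        executor.scheduleWithFixedDelay(safe, 0, period, unit);
        return executor;
    }

    // Demo helper: counts how often a dummy check runs within waitMillis.
    public static int countRuns(long periodMillis, long waitMillis) {
        final AtomicInteger runs = new AtomicInteger();
        ScheduledExecutorService executor = schedule(new Runnable() {
            public void run() {
                runs.incrementAndGet();
            }
        }, periodMillis, TimeUnit.MILLISECONDS);
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        executor.shutdownNow();
        return runs.get();
    }

    public static void main(String[] args) {
        // Short period so the demo finishes quickly; the real one would be
        // something like schedule(check, 10, TimeUnit.MINUTES).
        System.out.println("runs: " + countRuns(50, 300));
    }
}
```

A fixed delay (rather than a fixed rate) is used so that a slow check never overlaps with the next one.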

Thanks in advance.
