Hadoop >> mail # user >> Re: Is there a good way to see how full hdfs is


Ivan.Novick@... 2011-10-17, 15:38
Harsh J 2011-10-17, 16:05
Ivan.Novick@... 2011-10-17, 16:18
Uma Maheswara Rao G 72686... 2011-10-17, 16:56
Rajiv Chittajallu 2011-10-18, 01:04
Ivan.Novick@... 2011-10-18, 16:23
Re: Is there a good way to see how full hdfs is
[EMAIL PROTECTED] wrote on 10/18/11 at 09:23:50 -0700:
>Cool, is there any documentation on how to use the JMX stuff to get
>monitoring data?

I don't know if there is any specific documentation. These are the
mbeans you might be interested in:

Namenode:

Hadoop:service=NameNode,name=FSNamesystemState
Hadoop:service=NameNode,name=NameNodeInfo
Hadoop:service=NameNode,name=jvm

JobTracker:

Hadoop:service=JobTracker,name=JobTrackerInfo
Hadoop:service=JobTracker,name=QueueMetrics,q=<queuename>
Hadoop:service=JobTracker,name=jvm

DataNode:
Hadoop:name=DataNodeInfo,service=DataNode

TaskTracker:
Hadoop:service=TaskTracker,name=TaskTrackerInfo

You may also want to monitor shuffle_exceptions_caught in
Hadoop:service=TaskTracker,name=ShuffleServerMetrics
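[Editor's note: a minimal sketch of pulling one of these beans over the HTTP /jmx servlet mentioned later in the thread. The hostname is a placeholder, and 50070 is the usual NameNode web UI port; adjust both for your cluster.]

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class JmxQuery {

    // Build the /jmx servlet URL for a given bean; the qry parameter
    // takes the full ObjectName, e.g. Hadoop:service=NameNode,name=NameNodeInfo
    static String jmxUrl(String host, int port, String bean) {
        return "http://" + host + ":" + port + "/jmx?qry=" + bean;
    }

    public static void main(String[] args) throws Exception {
        // namenode.example.com is a placeholder for your NN host
        URL url = new URL(jmxUrl("namenode.example.com", 50070,
                "Hadoop:service=NameNode,name=NameNodeInfo"));
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                // The servlet returns JSON with Total, Used, Free,
                // PercentUsed, LiveNodes, etc.
                System.out.println(line);
            }
        }
    }
}
```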

>
>Cheers,
>Ivan
>
>On 10/17/11 6:04 PM, "Rajiv Chittajallu" <[EMAIL PROTECTED]> wrote:
>
>>If you are running > 0.20.204
>>http://phanpy-nn1.hadoop.apache.org:50070/jmx?qry=Hadoop:service=NameNode,
>>name=NameNodeInfo
>>
>>
>>[EMAIL PROTECTED] wrote on 10/17/11 at 09:18:20 -0700:
>>>Hi Harsh,
>>>
>>>I need access to the data programmatically for system automation, and
>>>hence
>>>I do not want a monitoring tool but access to the raw data.
>>>
>>>I am more than happy to use an exposed function or client program and not
>>>an internal API.
>>>
>>>So I am still a bit confused... What is the simplest way to get at this
>>>raw disk usage data programmatically?  Is there an HDFS equivalent of du
>>>and df, or are you suggesting to just run that on the Linux OS (which is
>>>perfectly doable).
>>>
>>>Cheers,
>>>Ivan
>>>
>>>
>>>On 10/17/11 9:05 AM, "Harsh J" <[EMAIL PROTECTED]> wrote:
>>>
>>>>Uma/Ivan,
>>>>
>>>>The DistributedFileSystem class explicitly is _not_ meant for public
>>>>consumption, it is an internal one. Additionally, that method has been
>>>>deprecated.
>>>>
>>>>What you need is FileSystem#getStatus() if you want the summarized
>>>>report via code.
>>>>
>>>>A job that runs "du" or "df" is a good idea if you can
>>>>guarantee perfect homogeneity of path names in your cluster.
>>>>
>>>>But I wonder, why won't using a general monitoring tool (such as
>>>>nagios) for this purpose cut it? What's the end goal here?
>>>>
>>>>P.s. I'd moved this conversation to hdfs-user@ earlier on, but now I
>>>>see it being cross posted into mr-user, common-user, and common-dev --
>>>>Why?
>>>>
>>>>On Mon, Oct 17, 2011 at 9:25 PM, Uma Maheswara Rao G 72686
>>>><[EMAIL PROTECTED]> wrote:
>>>>> We can write a simple program that calls this API.
>>>>>
>>>>> Make sure the Hadoop jars are present in your classpath.
>>>>> Just for more clarification: DNs send their stats as part of
>>>>>heartbeats, so the NN maintains all the statistics about disk-space
>>>>>usage for the complete filesystem, etc. This API will give you those
>>>>>stats.
>>>>>
>>>>> Regards,
>>>>> Uma
>>>>>
>>>>> ----- Original Message -----
>>>>> From: [EMAIL PROTECTED]
>>>>> Date: Monday, October 17, 2011 9:07 pm
>>>>> Subject: Re: Is there a good way to see how full hdfs is
>>>>> To: [EMAIL PROTECTED], [EMAIL PROTECTED]
>>>>> Cc: [EMAIL PROTECTED]
>>>>>
>>>>>> So is there a client program to call this?
>>>>>>
>>>>>> Can one write their own simple client to call this method from all
>>>>>> disks on the cluster?
>>>>>>
>>>>>> How about a map reduce job to collect from all disks on the cluster?
>>>>>>
>>>>>> On 10/15/11 4:51 AM, "Uma Maheswara Rao G 72686"
>>>>>> <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>> >/** Return the disk usage of the filesystem, including total
>>>>>> >   * capacity, used space, and remaining space */
>>>>>> >  public DiskStatus getDiskStatus() throws IOException {
>>>>>> >    return dfs.getDiskStatus();
>>>>>> >  }
>>>>>> >
>>>>>> >DistributedFileSystem exposes the above API on the Java side.
>>>>>> >
>>>>>> >Regards,
>>>>>> >Uma
>>>>>> >
>>>>>> >----- Original Message -----
>>>>>> >From: wd <[EMAIL PROTECTED]>
>>>>>> >Date: Saturday, October 15, 2011 4:16 pm
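[Editor's note: to tie the thread together, the non-deprecated route Harsh pointed at is FileSystem#getStatus(). A rough sketch follows; it assumes the Hadoop client jars and your cluster's core-site.xml are on the classpath, and the percentUsed helper is an illustrative addition, not part of the Hadoop API.]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class HdfsFullness {

    // Derived metric; pure arithmetic, no cluster needed
    static double percentUsed(long used, long capacity) {
        return capacity == 0 ? 0.0 : 100.0 * used / capacity;
    }

    public static void main(String[] args) throws Exception {
        // Picks up the default filesystem from core-site.xml on the classpath
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        FsStatus status = fs.getStatus(); // asks the NameNode; roughly "df" for HDFS
        System.out.printf("capacity=%d used=%d remaining=%d (%.1f%% used)%n",
                status.getCapacity(), status.getUsed(), status.getRemaining(),
                percentUsed(status.getUsed(), status.getCapacity()));
    }
}
```

This returns the same cluster-wide totals the NameNode web UI shows, so it suits the "raw data for automation" use case asked about above.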
Mapred Learn 2011-10-20, 14:31