Hadoop >> mail # user >> Re: Is there a good way to see how full hdfs is

Ivan.Novick@... 2011-10-17, 15:38
Harsh J 2011-10-17, 16:05
Ivan.Novick@... 2011-10-17, 16:18
Uma Maheswara Rao G 72686... 2011-10-17, 16:56
Rajiv Chittajallu 2011-10-18, 01:04
Ivan.Novick@... 2011-10-18, 16:23
Re: Is there a good way to see how full hdfs is
[EMAIL PROTECTED] wrote on 10/18/11 at 09:23:50 -0700:
>Cool, is there any documentation on how to use the JMX stuff to get
>monitoring data?

I don't know if there is any specific documentation. These are the
mbeans you might be interested in:

You may also want to monitor shuffle_exceptions_caught in
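As an illustration of pulling these JMX numbers programmatically: in the release line Rajiv mentions, the NameNode also exposes its mbeans as JSON over HTTP via the /jmx servlet. The sketch below parses a sample payload by key; the bean name, field names, and values are assumptions for illustration, not taken from the thread, and a real client would fetch the JSON from the NameNode and use a proper JSON library.

```java
import java.util.NoSuchElementException;

/** Minimal sketch: pull numeric fields out of a NameNode /jmx JSON payload.
 *  SAMPLE is a made-up payload; real bean and attribute names may differ. */
public class JmxDiskStats {
    static final String SAMPLE =
        "{\"beans\":[{\"name\":\"Hadoop:service=NameNode,name=FSNamesystemState\","
      + "\"CapacityTotal\":100000,\"CapacityUsed\":25000,\"CapacityRemaining\":75000}]}";

    /** Naive extraction of a numeric value by key (no JSON library needed). */
    static long extractLong(String json, String key) {
        int i = json.indexOf("\"" + key + "\":");
        if (i < 0) throw new NoSuchElementException(key);
        int start = i + key.length() + 3;  // skip past "key":
        int end = start;
        while (end < json.length() && Character.isDigit(json.charAt(end))) end++;
        return Long.parseLong(json.substring(start, end));
    }

    public static void main(String[] args) {
        long total = extractLong(SAMPLE, "CapacityTotal");
        long used  = extractLong(SAMPLE, "CapacityUsed");
        System.out.printf("used %d of %d bytes (%.1f%%)%n",
                          used, total, 100.0 * used / total);
    }
}
```

On a live cluster the same numbers would come from an HTTP GET against the NameNode web UI port, e.g. a URL of the form `http://<namenode>:50070/jmx` (port and query syntax are assumptions depending on your release and configuration).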

>On 10/17/11 6:04 PM, "Rajiv Chittajallu" <[EMAIL PROTECTED]> wrote:
>>If you are running > 0.20.204
>>[EMAIL PROTECTED] wrote on 10/17/11 at 09:18:20 -0700:
>>>Hi Harsh,
>>>I need access to the data programmatically for system automation, and
>>>I do not want a monitoring tool but access to the raw data.
>>>I am more than happy to use an exposed function or client program and not
>>>an internal API.
>>>So I am still a bit confused... What is the simplest way to get at this
>>>raw disk usage data programmatically? Is there an HDFS equivalent of du
>>>and df, or are you suggesting to just run those on the Linux OS (which is
>>>perfectly doable)?
>>>On 10/17/11 9:05 AM, "Harsh J" <[EMAIL PROTECTED]> wrote:
>>>>The DistributedFileSystem class explicitly is _not_ meant for public
>>>>consumption, it is an internal one. Additionally, that method has been
>>>>What you need is FileSystem#getStatus() if you want the summarized
>>>>report via code.
>>>>A job that possibly runs "du" or "df" is a good idea if you can
>>>>guarantee perfect homogeneity of path names in your cluster.
>>>>But I wonder, why won't using a general monitoring tool (such as
>>>>nagios) for this purpose cut it? What's the end goal here?
>>>>P.S. I'd moved this conversation to hdfs-user@ earlier on, but now I
>>>>see it being cross-posted into mr-user, common-user, and common-dev --
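The FileSystem#getStatus() call Harsh points to returns the summarized capacity, used, and remaining byte counts for the filesystem (as org.apache.hadoop.fs.FsStatus). As a minimal sketch of turning those three numbers into a df-style line, with hardcoded placeholder values standing in for the real getStatus() call (so it runs without a cluster):

```java
/** Sketch: format FsStatus-style numbers into a df-like line.
 *  On a real cluster the three values would come from
 *  FileSystem#getStatus(); the ones in main() are placeholders. */
public class HdfsDf {
    static String dfLine(long capacity, long used, long remaining) {
        double pct = capacity == 0 ? 0.0 : 100.0 * used / capacity;
        return String.format("capacity=%d used=%d remaining=%d use%%=%.1f",
                             capacity, used, remaining, pct);
    }

    public static void main(String[] args) {
        // With Hadoop on the classpath, the real call would look like:
        //   FsStatus s = FileSystem.get(new Configuration()).getStatus();
        //   System.out.println(dfLine(s.getCapacity(), s.getUsed(), s.getRemaining()));
        System.out.println(dfLine(100L << 30, 40L << 30, 60L << 30));
    }
}
```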
>>>>On Mon, Oct 17, 2011 at 9:25 PM, Uma Maheswara Rao G 72686
>>>><[EMAIL PROTECTED]> wrote:
>>>>> We can write a simple program and you can call this API.
>>>>> Make sure the Hadoop jars are present in your classpath.
>>>>> Just for more clarification: the DNs send their stats as part of
>>>>>heartbeats, so the NN maintains all the statistics about the disk
>>>>>space usage for the complete filesystem. This API will give you
>>>>> Regards,
>>>>> Uma
>>>>> ----- Original Message -----
>>>>> Date: Monday, October 17, 2011 9:07 pm
>>>>> Subject: Re: Is there a good way to see how full hdfs is
>>>>>> So is there a client program to call this?
>>>>>> Can one write their own simple client to call this method from all
>>>>>> disks on the cluster?
>>>>>> How about a map reduce job to collect from all disks on the cluster?
>>>>>> On 10/15/11 4:51 AM, "Uma Maheswara Rao G 72686"
>>>>>> <[EMAIL PROTECTED]>wrote:
>>>>>> >/** Return the disk usage of the filesystem, including total
>>>>>> > * capacity, used space, and remaining space */
>>>>>> >  public DiskStatus getDiskStatus() throws IOException {
>>>>>> >    return dfs.getDiskStatus();
>>>>>> >  }
>>>>>> >
>>>>>> >DistributedFileSystem has the above API from java API side.
>>>>>> >
>>>>>> >Regards,
>>>>>> >Uma
>>>>>> >
>>>>>> >----- Original Message -----
>>>>>> >From: wd <[EMAIL PROTECTED]>
>>>>>> >Date: Saturday, October 15, 2011 4:16 pm