Hadoop, mail # user - Re: Is there a good way to see how full hdfs is

Re: Is there a good way to see how full hdfs is
Rajiv Chittajallu 2011-10-18, 01:04
If you are running > 0.20.204, hit the NameNode's JMX servlet:
http://phanpy-nn1.hadoop.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
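
For example, a minimal sketch that fetches that JMX endpoint over HTTP and
prints the JSON (the hostname below is just the one from the URL above;
substitute your own NameNode host and port):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class NameNodeJmx {
    public static void main(String[] args) throws Exception {
        // The NameNode serves its JMX metrics as JSON over the HTTP port.
        URL url = new URL("http://phanpy-nn1.hadoop.apache.org:50070"
            + "/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // JSON with fields such as Total, Used, Free
            }
        }
    }
}
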
[EMAIL PROTECTED] wrote on 10/17/11 at 09:18:20 -0700:
>Hi Harsh,
>
>I need access to the data programmatically for system automation, and hence
>I do not want a monitoring tool but access to the raw data.
>
>I am more than happy to use an exposed function or client program and not
>an internal API.
>
>So I am still a bit confused... What is the simplest way to get at this
>raw disk usage data programmatically?  Is there an HDFS equivalent of du
>and df, or are you suggesting to just run those on the Linux OS (which is
>perfectly doable)?
>
>Cheers,
>Ivan
>
>
>On 10/17/11 9:05 AM, "Harsh J" <[EMAIL PROTECTED]> wrote:
>
>>Uma/Ivan,
>>
>>The DistributedFileSystem class is explicitly _not_ meant for public
>>consumption; it is an internal one. Additionally, that method has been
>>deprecated.
>>
>>What you need is FileSystem#getStatus() if you want the summarized
>>report via code.
>>
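>>For example, a minimal sketch (it assumes the usual Hadoop configuration
>>is available on the classpath):
>>
>>import org.apache.hadoop.conf.Configuration;
>>import org.apache.hadoop.fs.FileSystem;
>>import org.apache.hadoop.fs.FsStatus;
>>
>>public class HdfsUsage {
>>    public static void main(String[] args) throws Exception {
>>        FileSystem fs = FileSystem.get(new Configuration());
>>        FsStatus status = fs.getStatus(); // summarized, cluster-wide report
>>        System.out.println("Capacity:  " + status.getCapacity());
>>        System.out.println("Used:      " + status.getUsed());
>>        System.out.println("Remaining: " + status.getRemaining());
>>    }
>>}
>>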
>>A job that runs "du" or "df" is a good idea only if you can
>>guarantee perfect homogeneity of path names across your cluster.
>>
>>But I wonder, why won't using a general monitoring tool (such as
>>Nagios) for this purpose cut it? What's the end goal here?
>>
>>P.S. I'd moved this conversation to hdfs-user@ earlier on, but now I
>>see it being cross-posted to mr-user, common-user, and common-dev --
>>why?
>>
>>On Mon, Oct 17, 2011 at 9:25 PM, Uma Maheswara Rao G 72686
>><[EMAIL PROTECTED]> wrote:
>>> We can write a simple program that calls this API.
>>>
>>> Make sure the Hadoop jars are present in your classpath.
>>> Just for clarification: DNs send their stats as part of their
>>>heartbeats, so the NN maintains all the statistics about disk space
>>>usage for the complete filesystem. This API will give you those stats.
>>>
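>>>A minimal sketch of calling that API (note, per Harsh's reply above,
>>>that DistributedFileSystem is internal and getDiskStatus() is
>>>deprecated; the method names here follow the 0.20-era API):
>>>
>>>import org.apache.hadoop.conf.Configuration;
>>>import org.apache.hadoop.fs.FileSystem;
>>>import org.apache.hadoop.hdfs.DistributedFileSystem;
>>>
>>>public class DiskStatusExample {
>>>    public static void main(String[] args) throws Exception {
>>>        // The cast only succeeds when the default filesystem is HDFS.
>>>        DistributedFileSystem dfs =
>>>            (DistributedFileSystem) FileSystem.get(new Configuration());
>>>        DistributedFileSystem.DiskStatus ds = dfs.getDiskStatus();
>>>        System.out.println("Capacity:  " + ds.getCapacity());
>>>        System.out.println("DFS used:  " + ds.getDfsUsed());
>>>        System.out.println("Remaining: " + ds.getRemaining());
>>>    }
>>>}
>>>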
>>> Regards,
>>> Uma
>>>
>>> ----- Original Message -----
>>> From: [EMAIL PROTECTED]
>>> Date: Monday, October 17, 2011 9:07 pm
>>> Subject: Re: Is there a good way to see how full hdfs is
>>> To: [EMAIL PROTECTED], [EMAIL PROTECTED]
>>> Cc: [EMAIL PROTECTED]
>>>
>>>> So is there a client program to call this?
>>>>
>>>> Can one write their own simple client to call this method from all
>>>> disks on the cluster?
>>>>
>>>> How about a MapReduce job to collect from all disks on the cluster?
>>>>
>>>> On 10/15/11 4:51 AM, "Uma Maheswara Rao G 72686"
>>>> <[EMAIL PROTECTED]> wrote:
>>>>
>>>> >/** Return the disk usage of the filesystem, including total capacity,
>>>> > * used space, and remaining space */
>>>> >public DiskStatus getDiskStatus() throws IOException {
>>>> >  return dfs.getDiskStatus();
>>>> >}
>>>> >
>>>> >DistributedFileSystem exposes the above API on the Java side.
>>>> >
>>>> >Regards,
>>>> >Uma
>>>> >
>>>> >----- Original Message -----
>>>> >From: wd <[EMAIL PROTECTED]>
>>>> >Date: Saturday, October 15, 2011 4:16 pm
>>>> >Subject: Re: Is there a good way to see how full hdfs is
>>>> >To: [EMAIL PROTECTED]
>>>> >
>>>> >> hadoop dfsadmin -report
>>>> >>
>>>> >> On Sat, Oct 15, 2011 at 8:16 AM, Steve Lewis
>>>> >> <[EMAIL PROTECTED]> wrote:
>>>> >> > We have a small cluster with HDFS running on only 8 nodes. I believe
>>>> >> > that the partition assigned to HDFS might be getting full, and I
>>>> >> > wonder if the web tools or Java API have a way to look at free space
>>>> >> > on HDFS.
>>>> >> >
>>>> >> > --
>>>> >> > Steven M. Lewis PhD
>>>> >> > 4221 105th Ave NE
>>>> >> > Kirkland, WA 98033
>>>> >> > 206-384-1340 (cell)
>>>> >> > Skype lordjoe_com
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >>
>>>> >
>>>>
>>>>
>>>
>>
>>
>>
>>--
>>Harsh J
>>
>