Re: Is there a good way to see how full hdfs is
Ivan.Novick@... 2011-10-17, 15:38
So is there a client program to call this?
Can one write their own simple client to call this method from all disks on the cluster?
How about a map reduce job to collect from all disks on the cluster?
On 10/15/11 4:51 AM, "Uma Maheswara Rao G 72686" <[EMAIL PROTECTED]> wrote:
> /** Return the disk usage of the filesystem, including total capacity,
>  * used space, and remaining space */
> public DiskStatus getDiskStatus() throws IOException {
>   return dfs.getDiskStatus();
> }
>
> DistributedFileSystem exposes the above method in the Java API.
>
> Regards,
> Uma
>
> ----- Original Message -----
> From: wd <[EMAIL PROTECTED]>
> Date: Saturday, October 15, 2011 4:16 pm
> Subject: Re: Is there a good way to see how full hdfs is
> To: [EMAIL PROTECTED]
>
>> hadoop dfsadmin -report
>>
>> On Sat, Oct 15, 2011 at 8:16 AM, Steve Lewis <[EMAIL PROTECTED]> wrote:
>> > We have a small cluster with HDFS running on only 8 nodes - I believe that
>> > the partition assigned to hdfs might be getting full and
>> > wonder if the web tools or java api have a way to look at free space on
>> > hdfs
>> >
>> > --
>> > Steven M. Lewis PhD
>> > 4221 105th Ave NE
>> > Kirkland, WA 98033
>> > 206-384-1340 (cell)
>> > Skype lordjoe_com
Re: Is there a good way to see how full hdfs is
Harsh J 2011-10-17, 16:05
Uma/Ivan,
The DistributedFileSystem class is explicitly _not_ meant for public consumption; it is an internal one. Additionally, that method has been deprecated.
What you need is FileSystem#getStatus() if you want the summarized report via code.
A job, that possibly runs "du" or "df", is a good idea if you guarantee perfect homogeneity of path names in your cluster.
But I wonder, why won't using a general monitoring tool (such as nagios) for this purpose cut it? What's the end goal here?
P.s. I'd moved this conversation to hdfs-user@ earlier on, but now I see it being cross posted into mr-user, common-user, and common-dev -- Why?
On Mon, Oct 17, 2011 at 9:25 PM, Uma Maheswara Rao G 72686 <[EMAIL PROTECTED]> wrote:
> We can write a simple program, and you can call this API.
>
> Make sure the Hadoop jars are present in your classpath.
> Just for more clarification: the DNs send their stats as part of their heartbeats, so the NN maintains all the statistics about disk space usage for the complete filesystem, etc. This API will give you those stats.
>
> Regards,
> Uma
-- Harsh J
Re: Is there a good way to see how full hdfs is
Ivan.Novick@... 2011-10-17, 16:18
Hi Harsh,
I need access to the data programmatically for system automation, and hence I do not want a monitoring tool but access to the raw data.
I am more than happy to use an exposed function or client program and not an internal API.
So I am still a bit confused... What is the simplest way to get at this raw disk usage data programmatically? Is there an HDFS equivalent of du and df, or are you suggesting to just run those on the Linux OS (which is perfectly doable)?
Cheers,
Ivan
Re: Is there a good way to see how full hdfs is
Uma Maheswara Rao G 72686... 2011-10-17, 16:56
Yes, that was deprecated in trunk.
If you want to do this programmatically, this will be the better option as well:
  /** {@inheritDoc} */
  @Override
  public FsStatus getStatus(Path p) throws IOException {
    statistics.incrementReadOps(1);
    return dfs.getDiskStatus();
  }
This should work for you.
It will give you an FsStatus object, which provides the methods getCapacity, getUsed, and getRemaining.
I would suggest you look through the available FileSystem APIs once; I think that will give you a clear understanding of how to use them.
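As a rough illustration of what automation could do with the three numbers returned by getStatus (capacity, used, remaining), here is a minimal sketch in plain Java. It deliberately avoids the Hadoop jars so it can be compiled on its own; the Hadoop call is shown only in a comment. The class name HdfsUsage, its methods, and the byte counts in main are hypothetical and not part of Hadoop.

```java
// Minimal sketch: turn capacity/used/remaining byte counts into a usage summary.
// On a real cluster the values would come from the Hadoop client, roughly:
//   FsStatus st = FileSystem.get(new Configuration()).getStatus();
//   long capacity = st.getCapacity(), used = st.getUsed(), remaining = st.getRemaining();
// HdfsUsage and its methods are hypothetical names, not part of Hadoop.
final class HdfsUsage {

    /** Percentage of capacity in use; returns 0 when capacity is unknown (<= 0). */
    static double percentUsed(long capacity, long used) {
        if (capacity <= 0) {
            return 0.0;
        }
        return 100.0 * used / capacity;
    }

    /** True when remaining space falls below the given fraction of capacity. */
    static boolean isNearlyFull(long capacity, long remaining, double minFreeFraction) {
        return capacity > 0 && (double) remaining / capacity < minFreeFraction;
    }

    public static void main(String[] args) {
        // Placeholder byte counts standing in for getCapacity/getUsed/getRemaining.
        long capacity = 8_000_000_000_000L;
        long used = 6_000_000_000_000L;
        long remaining = 1_500_000_000_000L;

        System.out.printf("used: %.1f%%%n", percentUsed(capacity, used));
        System.out.println("nearly full: " + isNearlyFull(capacity, remaining, 0.10));
    }
}
```

Note that used plus remaining need not add up to capacity, since space taken by non-HDFS files on the same partitions counts toward neither; that is why the sketch checks remaining directly instead of deriving it.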
Regards,
Uma
Re: Is there a good way to see how full hdfs is
Rajiv Chittajallu 2011-10-18, 01:04
If you are running > 0.20.204, you can query the NameNode's JMX servlet over HTTP:
http://phanpy-nn1.hadoop.apache.org:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo
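For automation, a URL of this shape can be fetched with nothing more than the JDK. The following is a minimal sketch, assuming the servlet is reachable without authentication and returns a text body (the exact response format is not stated in this thread, so it is simply printed here). The class name JmxServletClient, the default host "localhost", and the timeouts are hypothetical choices; port 50070 and the qry value are taken from the URL above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop.
final class JmxServletClient {

    /** Builds the servlet URL for one MBean, e.g. Hadoop:service=NameNode,name=NameNodeInfo. */
    static String buildQueryUrl(String host, int port, String mbeanName) {
        return "http://" + host + ":" + port + "/jmx?qry=" + mbeanName;
    }

    /** Fetches the response body as text; the caller decides how to parse it. */
    static String fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000); // assumed timeouts, tune for your network
        conn.setReadTimeout(5000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the NameNode host as the first argument; "localhost" is only a placeholder.
        String host = args.length > 0 ? args[0] : "localhost";
        String url = buildQueryUrl(host, 50070, "Hadoop:service=NameNode,name=NameNodeInfo");
        System.out.println(fetch(url));
    }
}
```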
Re: Is there a good way to see how full hdfs is
Ivan.Novick@... 2011-10-18, 16:23
Cool, is there any documentation on how to use the JMX stuff to get monitoring data?
Cheers,
Ivan
Re: Is there a good way to see how full hdfs is
Rajiv Chittajallu 2011-10-20, 00:00
I don't know if there is any specific documentation. These are the mbeans you might be interested in:

NameNode:
Hadoop:service=NameNode,name=FSNamesystemState
Hadoop:service=NameNode,name=NameNodeInfo
Hadoop:service=NameNode,name=jvm

JobTracker:
Hadoop:service=JobTracker,name=JobTrackerInfo
Hadoop:service=JobTracker,name=QueueMetrics,q=<queuename>
Hadoop:service=JobTracker,name=jvm

DataNode:
Hadoop:name=DataNodeInfo,service=DataNode

TaskTracker:
Hadoop:service=TaskTracker,name=TaskTrackerInfo

You may also want to monitor shuffle_exceptions_caught in Hadoop:service=TaskTracker,name=ShuffleServerMetrics
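These names follow the standard JMX object-name syntax (a domain, then comma-separated key=value properties), so they can be parsed with the JDK's javax.management.ObjectName before being used in a query. A minimal sketch, using a few of the names listed above; the class HadoopMBeanNames and its methods are hypothetical, and nothing here contacts a cluster:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not part of Hadoop; it only parses names locally.
final class HadoopMBeanNames {

    /** A few of the bean names from this thread. */
    static final String[] BEANS = {
        "Hadoop:service=NameNode,name=FSNamesystemState",
        "Hadoop:service=NameNode,name=NameNodeInfo",
        "Hadoop:name=DataNodeInfo,service=DataNode",
        "Hadoop:service=TaskTracker,name=TaskTrackerInfo",
    };

    /** Parses a bean name, converting the checked exception into an unchecked one. */
    static ObjectName parse(String name) {
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Not a valid JMX object name: " + name, e);
        }
    }

    /** Returns "service/name" for a bean, regardless of the property order in the string. */
    static String describe(String name) {
        ObjectName on = parse(name);
        return on.getKeyProperty("service") + "/" + on.getKeyProperty("name");
    }

    public static void main(String[] args) {
        for (String bean : BEANS) {
            System.out.println(parse(bean).getDomain() + " -> " + describe(bean));
        }
    }
}
```

Parsing first catches typos in a bean name early, and it makes the property order irrelevant (the DataNode name above lists name before service).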
Re: Is there a good way to see how full hdfs is
Mapred Learn 2011-10-20, 14:31
Hi,
I have the same question regarding the documentation. Also, is there something like this for memory and CPU utilization?
Thanks,
JJ