Hadoop, mail # user - collecting CPU, mem, iops of hadoop jobs


Re: collecting CPU, mem, iops of hadoop jobs
Patai Sangbutsarakum 2011-12-20, 22:55
Thanks again Arun, you saved me again.. :-)

This is a great starting point for CPU, and possibly memory.

For IOPS, I'd just like to ask whether the tasktracker/datanode collects the numbers,
or whether we should dig into the OS level, e.g. /proc/PID_OF_tt/io.
Hope this makes sense.
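If the framework doesn't expose it, the per-process counters in /proc/PID/io can be read directly. A minimal Python sketch, assuming a Linux kernel with per-task I/O accounting enabled (finding the right pid, e.g. the tasktracker's child JVM, is a separate problem and is assumed here):

```python
# Sketch only: read the per-process I/O counters Linux exposes in
# /proc/<pid>/io (requires a kernel with CONFIG_TASK_IO_ACCOUNTING).
# Which pid to read (e.g. the task's child JVM) must be found separately.

def parse_proc_io(text):
    """Turn 'read_bytes: 4096' style lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value.strip())
    return counters

def read_proc_io(pid):
    with open("/proc/%d/io" % pid) as f:
        return parse_proc_io(f.read())
```

Sampling this file periodically and differencing the counters gives a rough per-process I/O rate.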

-P

On Tue, Dec 20, 2011 at 1:22 PM, Arun C Murthy <[EMAIL PROTECTED]> wrote:
> Take a look at the JobHistory files produced for each job.
>
> With 0.20.205 you get CPU (slot millis).
> With 0.23 (alpha quality) you get CPU and JVM metrics (GC etc.). I believe you also get Memory, but not IOPS.
>
> Arun
>
> On Dec 20, 2011, at 1:11 PM, Patai Sangbutsarakum wrote:
>
>> Thanks for the reply, but I don't think the metrics exposed to Ganglia
>> are what I am really looking for.
>>
>> What I'm looking for is something like this (but not limited to):
>>
>> Job_xxxx_yyyy
>> CPU time: 10204 sec.   <-- aggregated from all tasknodes
>> IOPS: 2344  <-- aggregated from all datanodes
>> MEM: 30G   <-- aggregated
>>
>> etc.
>>
>> Job_aaa_bbb
>> CPU time:
>> IOPS:
>> MEM:
>>
>> Sorry for the ambiguous question.
>> Thanks
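The per-job summary sketched above is essentially a sum of each metric over all task-level samples. A rough Python illustration of that aggregation (the sample layout, job IDs, and field names are invented for illustration, not any Hadoop API):

```python
from collections import defaultdict

# Hypothetical per-task samples: each dict carries a job_id plus made-up
# metric names. The aggregation just sums each metric per job.
def aggregate_by_job(task_samples):
    totals = defaultdict(lambda: {"cpu_secs": 0, "iops": 0, "mem_bytes": 0})
    for sample in task_samples:
        for metric in ("cpu_secs", "iops", "mem_bytes"):
            totals[sample["job_id"]][metric] += sample[metric]
    return dict(totals)
```

In practice the per-task numbers would come from job history counters or /proc sampling; the summing step itself is this simple.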
>>
>> On Tue, Dec 20, 2011 at 12:47 PM, He Chen <[EMAIL PROTECTED]> wrote:
>>> You may need Ganglia. It is a cluster monitoring software.
>>>
>>> On Tue, Dec 20, 2011 at 2:44 PM, Patai Sangbutsarakum <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Hadoopers,
>>>>
>>>> We're running Hadoop 0.20 on CentOS 5.5. I am trying to find a way to
>>>> collect the CPU time, memory usage, and IOPS of each Hadoop job.
>>>> What would be a good starting point? A document? An API?
>>>>
>>>> Thanks in advance
>>>> -P
>>>>
>