Hadoop, mail # user - High load on datanode startup


Re: High load on datanode startup
Raj Vishwanathan 2012-05-09, 21:52
The picture is either too small or too pixelated for my eyes :-)

Can you log in to the box and send the output of top? If the system is unresponsive, it has to be something more than an unbalanced HDFS cluster, methinks.
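
If top is too sluggish to capture, roughly the same numbers can be read straight from /proc. On Linux the load average counts tasks in uninterruptible sleep (state D, typically waiting on I/O) as well as runnable ones, so a very high load with modest CPU usage may show up as a pile of D-state threads. Below is a minimal Python sketch, assuming a Linux box with /proc mounted; the function names are made up for illustration and are not part of Hadoop or of any tool mentioned in this thread.

```python
from collections import Counter
from pathlib import Path


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict."""
    fields = text.split()
    # Fourth field looks like "3/812": runnable tasks / total tasks.
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
    }


def state_from_stat(stat_line):
    """Return the one-letter task state from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' and take the next field.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_task_states(proc_root="/proc"):
    """Tally states (R, S, D, ...) over every thread of every process."""
    counts = Counter()
    for pid_dir in Path(proc_root).iterdir():
        if not pid_dir.name.isdigit():
            continue
        try:
            for task in (pid_dir / "task").iterdir():
                counts[state_from_stat((task / "stat").read_text())] += 1
        except (OSError, IndexError):
            # The process or thread exited while we were reading it.
            continue
    return counts
```

Calling `parse_loadavg(Path("/proc/loadavg").read_text())` and `count_task_states()` on the affected node would show whether the load is made up of runnable (R) or blocked (D) tasks, which narrows down whether this is CPU or I/O.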

Raj

>________________________________
> From: Darrell Taylor <[EMAIL PROTECTED]>
>To: [EMAIL PROTECTED]; Raj Vishwanathan <[EMAIL PROTECTED]>
>Sent: Wednesday, May 9, 2012 2:40 PM
>Subject: Re: High load on datanode startup
>
>On Wed, May 9, 2012 at 10:23 PM, Raj Vishwanathan <[EMAIL PROTECTED]> wrote:
>
>> When you say 'load', what do you mean? CPU load or something else?
>>
>
>I mean in the Unix sense of load average, i.e. top would show a load of
>(currently) 376.
>
>Looking at the Ganglia stats for the box, it's not CPU load as such: the graphs
>show actual CPU usage at 30%, but the number of running processes is
>simply growing in a linear manner. Screenshot of the Ganglia page here:
>
>https://picasaweb.google.com/lh/photo/Q0uFSzyLiriDuDnvyRUikXVR0iWwMibMfH0upnTwi28?feat=directlink
>
>
>
>>
>> Raj
>>
>>
>>
>> >________________________________
>> > From: Darrell Taylor <[EMAIL PROTECTED]>
>> >To: [EMAIL PROTECTED]
>> >Sent: Wednesday, May 9, 2012 9:52 AM
>> >Subject: High load on datanode startup
>> >
>> >Hi,
>> >
>> >I wonder if someone could give some pointers with a problem I'm having?
>> >
>> >I have a 7-machine cluster set up for testing and we have been pouring data
>> >into it for a week without issue. We have learnt several things along the way
>> >and solved all the problems up to now by searching online, but now I'm
>> >stuck. One of the data nodes decided to have a load of 70+ this morning.
>> >Stopping the datanode and tasktracker brought it back to normal, but every time
>> >I start the datanode again the load shoots through the roof, and all I get
>> >in the logs is:
>> >
>> >STARTUP_MSG: Starting DataNode
>> >STARTUP_MSG:   host = pl464/10.20.16.64
>> >STARTUP_MSG:   args = []
>> >STARTUP_MSG:   version = 0.20.2-cdh3u3
>> >STARTUP_MSG:   build >>
>> >file:///data/1/tmp/nightly_2012-03-20_13-13-48_3/hadoop-0.20-0.20.2+923.197-1~squeeze
>> >-************************************************************/
>> >
>> >
>> >2012-05-09 16:12:05,925 INFO
>> >org.apache.hadoop.security.UserGroupInformation: JAAS Configuration already
>> >set up for Hadoop, not re-installing.
>> >
>> >2012-05-09 16:12:06,139 INFO
>> >org.apache.hadoop.security.UserGroupInformation: JAAS Configuration already
>> >set up for Hadoop, not re-installing.
>> >
>> >Nothing else.
>> >
>> >The load seems to max out only 1 of the CPUs, but the machine becomes
>> >*very* unresponsive.
>> >
>> >Has anybody got any pointers on things I can try?
>> >
>> >Thanks
>> >Darrell.