Hadoop, mail # user - High load on datanode startup


Re: High load on datanode startup
Todd Lipcon 2012-05-11, 09:32
On Fri, May 11, 2012 at 2:29 AM, Darrell Taylor
<[EMAIL PROTECTED]> wrote:
>
> What I saw on the machine was thousands of recursive processes in ps of the
> form 'bash /usr/bin/hbase classpath...',  Stopping everything didn't clean
> the processes up so had to kill them manually with some grep/xargs foo.
>  Once this was all cleaned up and the hadoop-env.sh file removed the nodes
> seem to be happy again.
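
The manual cleanup described above can be sketched roughly as follows. This is only an illustration: the helper name `pids_matching` and the sample process list are made up, and the exact commands used on the affected nodes are not given in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the PID (first column) of every input line
# whose text contains the given pattern. The input is expected to look
# like the output of `ps -eo pid,args`.
pids_matching() {
  local pattern=$1
  awk -v pat="$pattern" 'index($0, pat) > 0 { print $1 }'
}

# Made-up sample of a process listing, for demonstration only.
sample='101 bash /usr/bin/hbase classpath
102 java -Xmx1g DataNode
103 bash /usr/bin/hbase classpath'

# xargs with no command joins the PIDs onto one line.
pids=$(printf '%s\n' "$sample" | pids_matching 'hbase classpath' | xargs)
echo "$pids"

# On a live node, pgrep/pkill avoid matching the filter process itself:
#   pgrep -f 'hbase classpath'    # list the PIDs first and review them
#   pkill -f 'hbase classpath'    # then kill them
```

Listing the PIDs before killing anything is the safer order, since a pattern match on the full command line can catch more than intended.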

Ah, my guess is that "hbase classpath" is now trying to include the
Hadoop dependencies by calling "hadoop classpath". But "hadoop
classpath" was recursing right back into "hbase classpath" because of
that setting in hadoop-env.sh. Basically you made a fork bomb, which
explains the shape of the graph in Ganglia perfectly.
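
A minimal sketch of the kind of loop described above, using shell functions as hypothetical stand-ins for the two launcher scripts. The thread does not quote the actual hadoop-env.sh setting, so the one shown in the comment is an assumption, and `MAX_DEPTH` is a guard added only so that the sketch terminates.

```shell
#!/usr/bin/env bash
# Stand-ins for `hadoop classpath` and `hbase classpath`; not the real scripts.
MAX_DEPTH=5   # safety guard, absent in the real scenario

hadoop_classpath() {
  local depth=$1 out
  # Assumption: hadoop-env.sh contains something like
  #   export HADOOP_CLASSPATH=$(hbase classpath)
  # so every hadoop invocation starts an hbase subprocess.
  out=$(hbase_classpath $((depth + 1)))
  echo "$out"
}

hbase_classpath() {
  local depth=$1 out
  if [ "$depth" -ge "$MAX_DEPTH" ]; then
    echo "stopped at depth $depth"
    return
  fi
  # hbase asks hadoop for its classpath, which sources hadoop-env.sh again.
  out=$(hadoop_classpath $((depth + 1)))
  echo "$out"
}

result=$(hadoop_classpath 0)
echo "$result"
```

Each `$(...)` starts a subshell that waits on its child, so without the guard the chain of waiting processes keeps growing until the machine runs out of resources.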

-Todd

>
> Darrell.
>
>
>>
>> Raj
>>
>>
>>
>> >________________________________
>> > From: Darrell Taylor <[EMAIL PROTECTED]>
>> >To: [EMAIL PROTECTED]
>> >Cc: Raj Vishwanathan <[EMAIL PROTECTED]>
>> >Sent: Thursday, May 10, 2012 3:57 AM
>> >Subject: Re: High load on datanode startup
>> >
>> >On Thu, May 10, 2012 at 9:33 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
>> >
>> >> That's real weird..
>> >>
>> >> If you can reproduce this after a reboot, I'd recommend letting the DN
>> >> run for a minute, and then capturing a "jstack <pid of dn>" as well as
>> >> the output of "top -H -p <pid of dn> -b -n 5" and send it to the list.
>> >
>> >
>> >What I did after the reboot this morning was to move my dn, nn, and
>> >mapred directories out of the way, create new ones, format them, and
>> >restart the node. It's now happy.
>> >
>> >I'll try moving the directories back later and do the jstack as you suggest.
>> >
>> >
>> >>
>> >> What JVM/JDK are you using? What OS version?
>> >>
>> >
>> >root@pl446:/# dpkg --get-selections | grep java
>> >java-common                                     install
>> >libjaxp1.3-java                                 install
>> >libjaxp1.3-java-gcj                             install
>> >libmysql-java                                   install
>> >libxerces2-java                                 install
>> >libxerces2-java-gcj                             install
>> >sun-java6-bin                                   install
>> >sun-java6-javadb                                install
>> >sun-java6-jdk                                   install
>> >sun-java6-jre                                   install
>> >
>> >root@pl446:/# java -version
>> >java version "1.6.0_26"
>> >Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
>> >Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
>> >
>> >root@pl446:/# cat /etc/issue
>> >Debian GNU/Linux 6.0 \n \l
>> >
>> >
>> >
>> >>
>> >> -Todd
>> >>
>> >>
>> >> On Wed, May 9, 2012 at 11:57 PM, Darrell Taylor
>> >> <[EMAIL PROTECTED]> wrote:
>> >> > On Wed, May 9, 2012 at 10:52 PM, Raj Vishwanathan <[EMAIL PROTECTED]> wrote:
>> >> >
>> >> >> The picture is either too small or too pixelated for my eyes :-)
>> >> >>
>> >> >
>> >> > There should be a zoom option in the top right of the page that
>> >> > allows you to view it full size.
>> >> >
>> >> >
>> >> >>
>> >> >> Can you log in to the box and send the output of top? If the system
>> >> >> is unresponsive, it has to be something more than an unbalanced hdfs
>> >> >> cluster, methinks.
>> >> >>
>> >> >
>> >> > Sorry, I'm unable to log in to the box; it's completely unresponsive.
>> >> >
>> >> >
>> >> >>
>> >> >> Raj
>> >> >>
>> >> >>
>> >> >>
>> >> >> >________________________________
>> >> >> > From: Darrell Taylor <[EMAIL PROTECTED]>
>> >> >> >To: [EMAIL PROTECTED]; Raj Vishwanathan <[EMAIL PROTECTED]>
>> >> >> >Sent: Wednesday, May 9, 2012 2:40 PM
>> >> >> >Subject: Re: High load on datanode startup
>> >> >> >
>> >> >> >On Wed, May 9, 2012 at 10:23 PM, Raj Vishwanathan <[EMAIL PROTECTED]> wrote:
>> >> >> >
>> >> >> >> When you say 'load', what do you mean? CPU load or something else?
>> >> >> >>
>
Todd Lipcon
Software Engineer, Cloudera