Patai Sangbutsarakum 2013-02-28, 20:06
Patai Sangbutsarakum 2013-02-28, 21:04
Viral Bajaria 2013-02-28, 21:18
Re: map stucks at 99.99%
Patai Sangbutsarakum 2013-02-28, 21:28
> What type of CPU is on the box ? load average seems pretty high for an 8-core
> box.

Xeon 3.07GHz, 24 cores

> Do you have ganglia on these boxes ? Is the load average always so high?
> What's the memory usage for the task and overall on the box ?

From top -p pid of the task:
CPU 143.2% MEM 1.7%
So it is not memory drying up here; the CPU is pretty pegged.

> How long has the map task been running in that stuck state ?

--> at least 2 hours.

It finally just finished after hours; it took double the usual time today.. T_T

On Thu, Feb 28, 2013 at 1:18 PM, Viral Bajaria <[EMAIL PROTECTED]> wrote:
> What type of CPU is on the box ? load average seems pretty high for an 8-core
> box. Do you have ganglia on these boxes ? Is the load average always so high ?
> What's the memory usage for the task and overall on the box ?
>
> How long has the map task been running in that stuck state ? If it's been a
> few minutes, I am surprised that the JT didn't try to run it on another node
> or have you switched off speculative execution ?
>
> Sorry too many questions !!
>
> You can try jstack, jmap. That will at least tell you about what's getting
> blocked.
>
> On Thu, Feb 28, 2013 at 1:04 PM, Patai Sangbutsarakum
> <[EMAIL PROTECTED]> wrote:
>>
>> - Check the box on which the task is running, is it under heavy load ?
>> Is there high amount of I/O wait ?
>> CPU, very warm load average: 47.47, 48.56, 49.00
>> I/O, chill on io 0.1x % on iowait, less than 20 tps, rarely up to
>> 100tps, on 10 disks jbod.
>>
>>
>> - You could check the task logs and see if they say anything about
>> what is going wrong ?
>> I would say no.. pretty much all of them are INFO
>>
>> - Did the task get pre-empted to other task trackers ? If yes, is it
>> stuck at the same spot on those ?
>> Nope.
>>
>> - What kind of work are you doing in the mapper ? Just reading from
>> HDFS and compute something or reading/writing from HBase ?
>> HDFS + compute, R/W
>> Absolutely no HBase.
>>
>> Would jstack, jmap be of any use ?
>>
>>
>> > - You could check the task logs and see if they say anything about
>> > what is going wrong ?
>> > - Did the task get pre-empted to other task trackers ? If yes, is it
>> > stuck at the same spot on those ?
>> > - What kind of work are you doing in the mapper ? Just reading from
>> > HDFS and compute something or reading/writing from HBase ?
>>
>> On Thu, Feb 28, 2013 at 12:25 PM, Viral Bajaria <[EMAIL PROTECTED]>
>> wrote:
>> > You could start off doing the following:
>> >
>> > - Check the box on which the task is running, is it under heavy load ?
>> > Is there high amount of I/O wait ?
>> > - You could check the task logs and see if they say anything about
>> > what is going wrong ?
>> > - Did the task get pre-empted to other task trackers ? If yes, is it
>> > stuck at the same spot on those ?
>> > - What kind of work are you doing in the mapper ? Just reading from
>> > HDFS and compute something or reading/writing from HBase ?
>> >
>> > Thanks,
>> > Viral
>> >
>> > On Thu, Feb 28, 2013 at 12:06 PM, Patai Sangbutsarakum
>> > <[EMAIL PROTECTED]> wrote:
>> >>
>> >> Hadoopers!!
>> >>
>> >> Need input from you guys,
>> >> i am looking at a critical job in production. it is stuck at 99.99% in
>> >> the map phase for much longer than it used to be..
>> >>
>> >> what to do to debug what is going on with those maps and why they do
>> >> not pass through,
>> >> even though tasks and task attempts are saying 100% progress but there
>> >> is no finish time...
>> >>
>> >> Please suggest
>> >> Patai
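The load figures quoted in this message can be put in per-core terms with a little arithmetic. The sketch below is only an illustration: the helper names `load_per_core` and `cores_used` are hypothetical, and it assumes `top` is reporting %CPU in its default per-core (Irix) mode, so that 100% corresponds to one fully busy core.

```python
def load_per_core(load_average: float, cores: int) -> float:
    """Load average divided by core count.

    Values well above 1.0 suggest more runnable work than cores
    (hypothetical helper, not from the thread).
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


def cores_used(top_cpu_percent: float) -> float:
    """Convert a top %CPU reading to cores, assuming 100% == one core."""
    return top_cpu_percent / 100.0


# Figures quoted in the thread.
one_minute_load = 47.47   # first value of "load average: 47.47, 48.56, 49.00"
core_count = 24           # "Xeon 3.07GHz, 24 cores"
task_cpu_percent = 143.2  # "CPU 143.2%" from top -p for the stuck task

print(f"load per core: {load_per_core(one_minute_load, core_count):.2f}")
print(f"cores used by the task: {cores_used(task_cpu_percent):.2f}")
```

Under those assumptions the box is carrying roughly two units of load per core while the stuck task itself accounts for under one and a half cores, which would fit the picture of a CPU-saturated node with little I/O wait rather than a single runaway task.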
Matt Davies 2013-02-28, 22:10
YouPeng Yang 2013-03-02, 02:36