

+
Patai Sangbutsarakum 2013-02-28, 20:06
+
Patai Sangbutsarakum 2013-02-28, 21:04
Re: map stucks at 99.99%
What type of CPU is on the box? The load average seems pretty high for an
8-core box. Do you have Ganglia on these boxes? Is the load average always
so high? What's the memory usage for the task and overall on the box?
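
To put a number on "pretty high": a rough rule of thumb is to divide the load average by the core count. The sketch below does that in Python; the helper names are made up for illustration, and the per-core ratio is only a heuristic, not a Hadoop metric.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Return the load average divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def current_load_per_core() -> float:
    """1-minute load average of this machine per logical CPU (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    # os.cpu_count() can return None, so fall back to 1 core.
    return load_per_core(one_minute, os.cpu_count() or 1)
```

With the 1-minute figure reported in this thread, `load_per_core(47.47, 8)` comes out just under 6 runnable or waiting tasks per core.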

How long has the map task been running in that stuck state? If it's been a
few minutes, I am surprised that the JT didn't try to run it on another
node. Or have you switched off speculative execution?

Sorry, too many questions!

You can try jstack and jmap. That will at least tell you what's getting
blocked.
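
If the jstack dump is long, a quick tally of thread states helps to see whether the task is mostly BLOCKED, WAITING or RUNNABLE. Here is a minimal sketch in Python, assuming the dump was saved to a text file first; the function name and the file name in the usage note are hypothetical.

```python
import re
from collections import Counter

# jstack prints a line such as
#   java.lang.Thread.State: WAITING (parking)
# under each thread header; capture only the state name.
STATE_RE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)", re.MULTILINE)


def count_thread_states(dump: str) -> Counter:
    """Count how many threads in a jstack dump are in each state."""
    return Counter(STATE_RE.findall(dump))
```

For example, after `jstack <pid> > dump.txt` on the task's JVM, `count_thread_states(open("dump.txt").read()).most_common()` lists the states from most to least frequent. It only summarizes; the stack frames of the BLOCKED threads still need to be read by hand.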

On Thu, Feb 28, 2013 at 1:04 PM, Patai Sangbutsarakum <
[EMAIL PROTECTED]> wrote:

> - Check the box on which the task is running, is it under heavy load?
> Is there a high amount of I/O wait?
> CPU: very warm, load average: 47.47, 48.56, 49.00
> I/O: chill, 0.1x % iowait, less than 20 tps, rarely up to
> 100 tps, on 10 disks JBOD.
>
>
> - You could check the task logs and see if they say anything about
> what is going wrong?
> I would say no. Pretty much all of them are INFO.
>
> - Did the task get pre-empted to other task trackers? If yes, is it
> stuck at the same spot on those?
> Nope.
>
> - What kind of work are you doing in the mapper? Just reading from
> HDFS and computing something, or reading/writing from HBase?
> HDFS + compute, R/W.
> Absolutely no HBase.
>
> Would jstack or jmap be of any use?
>
>
> On Thu, Feb 28, 2013 at 12:25 PM, Viral Bajaria <[EMAIL PROTECTED]>
> wrote:
> > You could start off doing the following:
> >
> > - Check the box on which the task is running, is it under heavy load ? Is
> > there high amount of I/O wait ?
> > - You could check the task logs and see if they say anything about what is
> > going wrong ?
> > - Did the task get pre-empted to other task trackers ? If yes, is it stuck
> > at the same spot on those ?
> > - What kind of work are you doing in the mapper ? Just reading from HDFS and
> > compute something or reading/writing from HBase ?
> >
> > Thanks,
> > Viral
> >
> > On Thu, Feb 28, 2013 at 12:06 PM, Patai Sangbutsarakum
> > <[EMAIL PROTECTED]> wrote:
> >>
> >> Hadoopers!!
> >>
> >> Need input from you guys.
> >> I am looking at a critical job in production. It is stuck at 99.99% in
> >> the map phase for much longer than it used to be.
> >>
> >> What can I do to debug what is going on with those maps and why they are
> >> not passing through? The tasks and task attempts say 100% progress, but
> >> there is no finish time.
> >>
> >> Please suggest
> >> Patai
> >
> >
>
+
Patai Sangbutsarakum 2013-02-28, 21:28
+
Matt Davies 2013-02-28, 22:10
+
YouPeng Yang 2013-03-02, 02:36