Re: map stucks at 99.99%
> What type of CPU is on the box ? load average seems pretty high for an 8-core
> box.
Xeon 3.07GHz, 24 cores

> Do you have ganglia on these boxes ? Is the load average always so high?
> What's the memory usage for the task and overall on the box ?
From top -p pid of the task:
CPU 143.2%  MEM 1.7%
So memory is not exhausted here, but the CPU is pretty pegged.

>
> How long has the map task been running in that stuck state ?
--> at least 2 hours.
It finally finished after hours; it took double the usual time today.. T_T
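
To put the numbers from this thread side by side, here is a rough sketch of the arithmetic in Python. It assumes top is in its default per-core ("Irix") mode, where 100% means one full core, and it uses only the figures quoted in the thread (load average 47.47, 24 cores, 143.2% CPU for the task). The helper name load_per_core is made up for illustration.

```python
def load_per_core(load_average: float, cores: int) -> float:
    """Return the load average normalized by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_average / cores


# Figures quoted in this thread: 1-minute load average 47.47 on a 24-core box.
ratio = load_per_core(47.47, 24)
print(f"load per core: {ratio:.2f}")

# Assumption: top is in per-core mode, so 143.2% is about 1.4 cores' worth
# of work for the one stuck task.
task_cores = 143.2 / 100
print(f"task uses about {task_cores:.1f} of 24 cores")
```

A ratio near 2 suggests roughly twice as many runnable tasks as cores, so the box as a whole looks oversubscribed on CPU, while the stuck task itself accounts for only a small share of that.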
On Thu, Feb 28, 2013 at 1:18 PM, Viral Bajaria <[EMAIL PROTECTED]> wrote:
> What type of CPU is on the box ? load average seems pretty high for an 8-core
> box. Do you have ganglia on these boxes ? Is the load average always so high
> ? What's the memory usage for the task and overall on the box ?
>
> How long has the map task been running in that stuck state ? If it's been a
> few minutes, I am surprised that the JT didn't try to run it on another node
> or have you switched off speculative execution ?
>
> Sorry too many questions !!
>
> You can try jstack, jmap. That will at least tell you about what's getting
> blocked.
>
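
One way to act on the jstack suggestion is to tally thread states in a dump, which gives a quick view of whether threads are mostly running or mostly blocked. The sketch below is in Python and relies only on the "java.lang.Thread.State:" line that jstack prints under each thread header. The sample dump is hand-written for illustration (the thread names and the com.example.MyMapper frames are hypothetical, not taken from the job in this thread), and count_thread_states is a made-up helper name.

```python
import re
from collections import Counter

# Matches the state line jstack prints under each thread header, e.g.
#    java.lang.Thread.State: RUNNABLE
STATE_RE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)", re.MULTILINE)


def count_thread_states(dump_text: str) -> Counter:
    """Count how many threads in a jstack dump are in each state."""
    return Counter(STATE_RE.findall(dump_text))


# Hypothetical, hand-written excerpt in the shape of jstack output.
SAMPLE_DUMP = '''
"main" prio=10 tid=0x0000 nid=0x1 runnable [0x0000]
   java.lang.Thread.State: RUNNABLE
	at com.example.MyMapper.map(MyMapper.java:42)

"IPC Client" daemon prio=10 tid=0x0001 nid=0x2 in Object.wait() [0x0000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)

"Thread-3" prio=10 tid=0x0002 nid=0x3 waiting for monitor entry [0x0000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.example.MyMapper.flush(MyMapper.java:77)
'''

states = count_thread_states(SAMPLE_DUMP)
for state, n in states.most_common():
    print(state, n)
```

On a live task, the text would come from running jstack against the task JVM's pid as the user that owns the process. Taking a few dumps some seconds apart and comparing them shows whether the same frames stay on top, which is usually more telling than a single snapshot.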
> On Thu, Feb 28, 2013 at 1:04 PM, Patai Sangbutsarakum
> <[EMAIL PROTECTED]> wrote:
>>
>> - Check the box on which the task is running, is it under heavy load ?
>> Is there high amount of I/O wait ?
>> CPU: very warm, load average: 47.47, 48.56, 49.00
>> I/O: quiet, 0.1x% iowait, less than 20 tps, rarely up to
>> 100 tps, on 10 disks in JBOD.
>>
>>
>> - You could check the task logs and see if they say anything about
>> what is going wrong ?
>> I would say no.. pretty much all of them are INFO
>>
>> - Did the task get pre-empted to other task trackers ? If yes, is it
>> stuck at the same spot on those ?
>> Nope.
>>
>> - What kind of work are you doing in the mapper ? Just reading from
>> HDFS and compute something or reading/writing from HBase ?
>> HDFS + compute, R/W
>> Absolutely no HBase.
>>
>> Would jstack, jmap be of any use ?
>>
>>
>> On Thu, Feb 28, 2013 at 12:25 PM, Viral Bajaria <[EMAIL PROTECTED]>
>> wrote:
>> > You could start off doing the following:
>> >
>> > - Check the box on which the task is running, is it under heavy load ?
>> > Is there high amount of I/O wait ?
>> > - You could check the task logs and see if they say anything about
>> > what is going wrong ?
>> > - Did the task get pre-empted to other task trackers ? If yes, is it
>> > stuck at the same spot on those ?
>> > - What kind of work are you doing in the mapper ? Just reading from
>> > HDFS and compute something or reading/writing from HBase ?
>> >
>> > Thanks,
>> > Viral
>> >
>> > On Thu, Feb 28, 2013 at 12:06 PM, Patai Sangbutsarakum
>> > <[EMAIL PROTECTED]> wrote:
>> >>
>> >> Hadoopers!!
>> >>
>> >> Need input from you guys,
>> >> I am looking at a critical job in production. It gets stuck at 99.99% in
>> >> the map phase for much longer than it used to.
>> >>
>> >> What can I do to debug what is going on with those maps, and why they do
>> >> not complete, even though the tasks and task attempts say 100% progress
>> >> but there is no finish time...
>> >>
>> >> Please suggest
>> >> Patai
>> >
>> >
>
>