MapReduce >> mail # user >> map stuck at 99.99%


Re: map stuck at 99.99%
Hi Patai
   I found a similar explanation in the Google MapReduce paper:

http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/zh-CN//archive/mapreduce-osdi04.pdf

   Please refer to section 3.6, Backup Tasks.
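
In Hadoop terms those backup tasks are speculative execution, which is on by
default but can be toggled per job. A minimal sketch assuming the old mapred
API and the Hadoop 1.x property names (the class name is only illustrative):

    import org.apache.hadoop.mapred.JobConf;

    // Sketch: "backup tasks" (paper section 3.6) correspond to Hadoop's
    // speculative execution. These setters map to the Hadoop 1.x properties
    // mapred.map.tasks.speculative.execution and
    // mapred.reduce.tasks.speculative.execution; both default to true.
    public class SpeculativeExecutionToggle {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            conf.setMapSpeculativeExecution(true);
            conf.setReduceSpeculativeExecution(true);
            System.out.println("map speculative execution = "
                    + conf.getMapSpeculativeExecution());
        }
    }

If it is enabled and the straggler is just a slow node, the JobTracker should
eventually schedule a backup attempt of that map elsewhere.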

Hope this helps.

regards

2013/3/1 Matt Davies <[EMAIL PROTECTED]>

> I've seen this before when the input data suddenly changes and no longer
> lends itself to parallelization, such as counting the number of tuples in
> a bag.
>
> One thing that may be interesting is comparing the job counters from a
> previous job vs. the job that just completed. Do they differ? Is there a
> particular mapper whose counts are way out of whack?
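
A rough sketch of pulling those counters for two job IDs side by side, using
the old mapred API; the class name, the argument handling, and the Hadoop 1.x
counter-group string are assumptions, not something from this thread:

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobID;
    import org.apache.hadoop.mapred.RunningJob;

    // Fetch counters for two jobs (e.g. yesterday's run vs today's) and
    // print the map input record counts so a sudden skew shows up quickly.
    public class CompareJobCounters {
        public static void main(String[] args) throws Exception {
            JobClient client = new JobClient(new JobConf());
            RunningJob previous = client.getJob(JobID.forName(args[0]));
            RunningJob current  = client.getJob(JobID.forName(args[1]));

            // Counter group name is the Hadoop 1.x one; 2.x uses
            // org.apache.hadoop.mapreduce.TaskCounter instead.
            String group = "org.apache.hadoop.mapred.Task$Counter";
            long prevRecords = previous.getCounters()
                    .findCounter(group, "MAP_INPUT_RECORDS").getValue();
            long currRecords = current.getCounters()
                    .findCounter(group, "MAP_INPUT_RECORDS").getValue();

            System.out.println("MAP_INPUT_RECORDS: previous=" + prevRecords
                    + " current=" + currRecords);
        }
    }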
>
> Has someone tweaked the production job in one way or another?
>
>
>
>
> On Thu, Feb 28, 2013 at 1:28 PM, Patai Sangbutsarakum <
> [EMAIL PROTECTED]> wrote:
>
>> > What type of CPU is on the box? Load average seems pretty high for an
>> > 8-core box.
>> Xeon 3.07GHz, 24 cores
>>
>> > Do you have ganglia on these boxes ? Is the load average always so high?
>> > What's the memory usage for the task and overall on the box ?
>> From top -p pid of the task
>> CPU 143.2%  MEM 1.7%
>> So memory is not being exhausted on it, but the CPU is pretty pegged.
>>
>> >
>> > How long has the map task been running in that stuck state ?
>> --> at least 2 hours.
>>
>>
>> It finally finished after hours; it took about double its usual time today.. T_T
>>
>>
>>
>>
>>
>>
>> On Thu, Feb 28, 2013 at 1:18 PM, Viral Bajaria <[EMAIL PROTECTED]>
>> wrote:
>> > What type of CPU is on the box? Load average seems pretty high for an
>> > 8-core box. Do you have ganglia on these boxes? Is the load average
>> > always so high? What's the memory usage for the task and overall on the
>> > box?
>> >
>> > How long has the map task been running in that stuck state? If it's been
>> > a few minutes, I am surprised that the JT didn't try to run it on another
>> > node, or have you switched off speculative execution?
>> >
>> > Sorry too many questions !!
>> >
>> > You can try jstack and jmap. That will at least tell you what's getting
>> > blocked.
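
If attaching jstack to the task JVM on the node is awkward, roughly the same
information can be logged from inside the task with the standard
java.lang.management API; a minimal sketch (the class name is only
illustrative, and this covers threads, not the heap view jmap gives):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    // Dump the state and stack of every live thread, similar in spirit to
    // `jstack <pid>` but callable from inside a stuck task for logging.
    public class ThreadDump {
        public static void main(String[] args) {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();
            for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
                // ThreadInfo.toString() truncates deep stacks; use
                // info.getStackTrace() if the full frames are needed.
                System.out.print(info.toString());
            }
        }
    }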
>> >
>> > On Thu, Feb 28, 2013 at 1:04 PM, Patai Sangbutsarakum
>> > <[EMAIL PROTECTED]> wrote:
>> >>
>> >> - Check the box on which the task is running, is it under heavy load ?
>> >> Is there high amount of I/O wait ?
>> >> CPU: quite busy, load average 47.47, 48.56, 49.00.
>> >> I/O: relaxed, around 0.1% iowait, fewer than 20 tps, rarely up to
>> >> 100 tps, on a 10-disk JBOD.
>> >>
>> >>
>> >> - You could check the task logs and see if they say anything about
>> >> what is going wrong ?
>> >> I would say no.. pretty much all of the entries are INFO.
>> >>
>> >> - Did the task get pre-empted to other task trackers ? If yes, is it
>> >> stuck at the same spot on those ?
>> >> Nope.
>> >>
>> >> - What kind of work are you doing in the mapper ? Just reading from
>> >> HDFS and compute something or reading/writing from HBase ?
>> >> HDFS + compute, R/W
>> >> Absolutely no HBase.
>> >>
>> >> Would jstack or jmap be of any use?
>> >>
>> >>
>> >> > - You could check the task logs and see if they say anything about
>> >> >   what is going wrong?
>> >> > - Did the task get pre-empted to other task trackers? If yes, is it
>> >> >   stuck at the same spot on those?
>> >> > - What kind of work are you doing in the mapper? Just reading from
>> >> >   HDFS and computing something, or reading/writing from HBase?
>> >>
>> >> On Thu, Feb 28, 2013 at 12:25 PM, Viral Bajaria <[EMAIL PROTECTED]>
>> >> wrote:
>> >> > You could start off doing the following:
>> >> >
>> >> > - Check the box on which the task is running: is it under heavy load?
>> >> >   Is there a high amount of I/O wait?
>> >> > - You could check the task logs and see if they say anything about
>> >> >   what is going wrong?
>> >> > - Did the task get pre-empted to other task trackers? If yes, is it
>> >> >   stuck at the same spot on those?