Hadoop >> mail # user >> map task attempt progress at 400%?


Re: map task attempt progress at 400%?
The first thing I would check is that your mappers are processing the
same amount of data. I'm not familiar with the Cassandra InputFormat,
but if it doesn't properly split the data, then you could end up with
this behavior. If the data is split properly, I'd look into swapping
as a possible cause.
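
A minimal sketch of that first check, in Python, assuming the per-task input record counts can be exported from the job's counters. The task ids, the numbers, and both function names are made up for illustration. The second function assumes progress is reported as records read divided by an estimated split size, which is one way an undersized estimate could show up as progress above 100%; whether the Cassandra record reader computes it this way is not confirmed here.

```python
from statistics import median


def find_skewed_tasks(input_records, factor=10.0):
    """Return the ids of tasks whose input is more than `factor` times the median."""
    typical = median(input_records.values())
    return sorted(task for task, n in input_records.items() if n > factor * typical)


def reported_progress(records_read, estimated_records):
    """Progress as a plain ratio; nothing caps it at 1.0 (an assumption)."""
    return records_read / estimated_records


# Hypothetical per-task input record counts.
counts = {
    "m_000001": 1000,
    "m_000002": 1100,
    "m_000003": 950,
    "m_000004": 600000,
}

for task in find_skewed_tasks(counts):
    print(task, counts[task])

# A task that reads 4000 records against an estimate of 1000 reports 400%.
print(f"{reported_progress(4000, 1000):.0%}")
```

If a handful of tasks stand out like this, the split is the likely culprit and swapping is the less likely one.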

Is it always the same nodes that are slow?

-Joey

On Thu, Nov 3, 2011 at 10:43 AM, Brendan W. <[EMAIL PROTECTED]> wrote:
> The input is actually performed by the apache-cassandra 0.6.9 api for
> map-reduce.  And yes, the cassandra row that is read into the mapper
> consists of a block of 100 compressed lines of text.  So maybe that
> accounts for the progress report.
>
> Any idea what the huge time difference might be due to (2 minutes average
> vs. 20 hrs for the last 3 tasks)?  Does that sound like swapping to you?
>
> Thanks,
>
> Brendan
>
> On Thu, Nov 3, 2011 at 9:44 AM, Joey Echeverria <[EMAIL PROTECTED]> wrote:
>
>> Is your input data compressed? There have been some bugs in the past
>> with reporting progress when reading compressed data.
>>
>> -Joey
>>
>> On Thu, Nov 3, 2011 at 9:18 AM, Brendan W. <[EMAIL PROTECTED]> wrote:
>> > Hi,
>> >
>> > Running 0.20.2:
>> >
>> > A job with about 4000 map tasks quickly blew through all but 3 in a
>> > couple of hours, with the tasks taking about two minutes each.  The
>> > remaining three, however, inched along, with their progress passing
>> > 100% and keeping on going.  After 20 hours or so, I killed the running
>> > task attempts.  They restarted, and same thing:  they inched their way
>> > past 100%, getting up past 400% and continuing.  They finally finished
>> > in the middle of last night.
>> >
>> > What does progress > 100% indicate?
>> >
>> > Thanks for any help.
>> >
>>
>>
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434
>>
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434