MapReduce >> mail # user >> MapReduce Failed and Killed


Re: MapReduce Failed and Killed
Any MapReduce task needs to communicate periodically with the tasktracker that
launched it, to let the tasktracker know it is still alive and making
progress. How long silence is tolerated is controlled by the
configuration property mapred.task.timeout.
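The timeout is specified in milliseconds. As a sketch (assuming the classic MRv1 property name, which your log output suggests you are using), the cluster-wide setting lives in mapred-site.xml:

```xml
<!-- mapred-site.xml: maximum silence tolerated before a task is killed -->
<property>
  <name>mapred.task.timeout</name>
  <!-- 1200000 ms = 20 minutes, matching the "1200 seconds" in your log -->
  <value>1200000</value>
</property>
```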

It looks like in your case this has already been bumped up to 20 minutes
(from the default of 10 minutes), and that this is still not sufficient.
You could bump the value even further up. However, the correct approach
would be to find out what the reducer is actually doing that keeps it
inactive for so long. Can you look at the reducer attempt's logs (which you
can access from the JobTracker's web UI) and post them here?
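If you do decide to raise the timeout for a single run rather than cluster-wide, Hadoop jobs driven through ToolRunner accept a -D property override on the command line. Whether the Mahout driver passes generic options through is an assumption on my part, so treat this as a sketch:

```shell
# Hypothetical one-off override: 40 minutes, expressed in milliseconds.
$MAHOUT_HOME/bin/mahout wikipediaDataSetCreator \
  -Dmapred.task.timeout=2400000 \
  -i wikipedia/chunks -o wikipediainput \
  -c $MAHOUT_HOME/examples/temp/categories.txt
```

That said, a task that stays silent for 20+ minutes usually points to a real underlying problem rather than a timeout that is merely too tight.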

Thanks
hemanth
On Fri, Mar 22, 2013 at 5:32 PM, Jinchun Kim <[EMAIL PROTECTED]> wrote:

> Hi, All.
>
> I'm trying to create category-based splits of the Wikipedia dataset (41 GB)
> and the training data set (5 GB) using Mahout.
> I'm using the following command.
>
> $MAHOUT_HOME/bin/mahout wikipediaDataSetCreator -i wikipedia/chunks -o
> wikipediainput -c $MAHOUT_HOME/examples/temp/categories.txt
>
> I had no problem with the training data set, but Hadoop showed the
> following messages when I tried to run the same job on the Wikipedia
> dataset:
>
> .........
> 13/03/21 22:31:00 INFO mapred.JobClient:  map 27% reduce 1%
> 13/03/21 22:40:31 INFO mapred.JobClient:  map 27% reduce 2%
> 13/03/21 22:58:49 INFO mapred.JobClient:  map 27% reduce 3%
> 13/03/21 23:22:57 INFO mapred.JobClient:  map 27% reduce 4%
> 13/03/21 23:46:32 INFO mapred.JobClient:  map 27% reduce 5%
> 13/03/22 00:27:14 INFO mapred.JobClient:  map 27% reduce 6%
> 13/03/22 01:06:55 INFO mapred.JobClient:  map 27% reduce 7%
> 13/03/22 01:14:06 INFO mapred.JobClient:  map 27% reduce 3%
> 13/03/22 01:15:35 INFO mapred.JobClient: Task Id :
> attempt_201303211339_0002_r_000000_1, Status : FAILED
> Task attempt_201303211339_0002_r_000000_1 failed to report status for 1200
> seconds. Killing!
> 13/03/22 01:20:09 INFO mapred.JobClient:  map 27% reduce 4%
> 13/03/22 01:33:35 INFO mapred.JobClient: Task Id :
> attempt_201303211339_0002_m_000037_1, Status : FAILED
> Task attempt_201303211339_0002_m_000037_1 failed to report status for 1228
> seconds. Killing!
> 13/03/22 01:35:12 INFO mapred.JobClient:  map 27% reduce 5%
> 13/03/22 01:40:38 INFO mapred.JobClient:  map 27% reduce 6%
> 13/03/22 01:52:28 INFO mapred.JobClient:  map 27% reduce 7%
> 13/03/22 02:16:27 INFO mapred.JobClient:  map 27% reduce 8%
> 13/03/22 02:19:02 INFO mapred.JobClient: Task Id :
> attempt_201303211339_0002_m_000018_1, Status : FAILED
> Task attempt_201303211339_0002_m_000018_1 failed to report status for 1204
> seconds. Killing!
> 13/03/22 02:49:03 INFO mapred.JobClient:  map 27% reduce 9%
> 13/03/22 02:52:04 INFO mapred.JobClient:  map 28% reduce 9%
> ........
>
> Because I just started to learn how to run Hadoop, I have no idea how to
> solve this problem...
> Does anyone have an idea how to handle this?
>
> --
> *Jinchun Kim*
>