Re: determining what files made up a failing task
Mat,

I could not find the properties in the documentation, which is why I
described this feature as hidden. As Harsh mentioned, there is an API for it.
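
If memory serves, the properties behind that API in 0.20.x are
mapred.max.map.failures.percent and mapred.max.reduce.failures.percent
(worth double-checking against your build), so they can also be set
directly on the JobConf:

  // Property names assumed from the 0.20.x JobConf source; verify locally.
  conf.set("mapred.max.map.failures.percent", "10");
  conf.set("mapred.max.reduce.failures.percent", "10");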

There was a blog entry from Cloudera on 'Automatically Documenting Apache
Hadoop Configuration'. It would be great if it were contributed to Apache
and made part of the build process. I suggested this before, but there was
no response.

http://www.cloudera.com/blog/2011/08/automatically-documenting-apache-hadoop-configuration/

Regards,
Praveen

On Sun, Dec 4, 2011 at 9:07 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Mat,
>
> Perhaps you can simply set a percentage of tolerated task failures for your job.
>
> Doable via
> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)
>  and
> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)
>
> If you set it to 10%, your job still passes as long as no more than 10% of
> the total Map or Reduce tasks fail. I think this fits your use-case.
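>
> A minimal sketch of a driver that sets these via the old API (the class
> name, job name and path arguments below are placeholders, not from your
> job):
>
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.mapred.FileInputFormat;
> import org.apache.hadoop.mapred.FileOutputFormat;
> import org.apache.hadoop.mapred.JobClient;
> import org.apache.hadoop.mapred.JobConf;
>
> public class TolerantJobDriver {
>   public static void main(String[] args) throws Exception {
>     JobConf conf = new JobConf(TolerantJobDriver.class);
>     conf.setJobName("map-only-with-failure-tolerance");
>
>     // Let up to 10% of map tasks fail without failing the whole job.
>     conf.setMaxMapTaskFailuresPercent(10);
>     // Same tolerance for reduces (moot for a map-only job).
>     conf.setMaxReduceTaskFailuresPercent(10);
>
>     conf.setNumReduceTasks(0); // map-only
>
>     FileInputFormat.setInputPaths(conf, new Path(args[0]));
>     FileOutputFormat.setOutputPath(conf, new Path(args[1]));
>     JobClient.runJob(conf);
>   }
> }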
>
> On 04-Dec-2011, at 4:05 AM, Mat Kelcey wrote:
>
> Hi folks,
>
> I have a Hadoop 0.20.2 map-only job with thousands of input tasks;
> I'm using the org.apache.nutch.tools.arc.ArcInputFormat input format,
> so each task corresponds to a single file in HDFS.
>
> Most of the way into the job it hits a task that causes the input
> format to OOM. After 4 attempts it fails the job.
> Now this is obviously not great, but for the purpose of my job I'd be
> happy to just throw this input file away; it's only one of thousands
> and I don't need exact results.
>
> The trouble is I can't work out what file this task corresponds to.
>
> The closest I can find is that the job history file lists a STATE_STRING, e.g.
> STATE_STRING="hdfs://ip-10-115-29-44\.ec2\.internal:9000/user/hadoop/arc_files\.aa/2009/09/17/0/1253240925734_0\.arc\.gz:0+100425468"
>
> but this is _only_ for the successfully completed ones; for the failed
> one I'm actually interested in there is nothing:
> MapAttempt TASK_TYPE="MAP" TASKID="task_201112030459_0011_m_004130"
> TASK_ATTEMPT_ID="attempt_201112030459_0011_m_004130_0"
> TASK_STATUS="FAILED" FINISH_TIME="1322901661261"
> HOSTNAME="ip-10-218-57-227\.ec2\.internal" ERROR="Error: null" .
>
> I grepped through all the Hadoop logs and couldn't find anything that
> relates this task to the files in its split.
> Any ideas where this info might be recorded?
>
> Cheers,
> Mat
>
>
>
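
As an aside, one way to find the offending file is to log the split from
inside the mapper itself. Assuming ArcInputFormat hands out ordinary
FileSplits (I have not checked its source), the 0.20 old API exposes the
current split's file via the map.input.file property, so something like
this should put the file name into the failed attempt's task logs:

  // Old-API mapper sketch; override configure() in your existing mapper.
  // map.input.file is only populated for FileSplit-based input formats.
  public void configure(JobConf job) {
    System.err.println("input file: " + job.get("map.input.file"));
  }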