Re: determining what files made up a failing task
Mat,

I could not find these properties in the documentation, which is why I
described the feature as hidden. As Harsh mentioned, there is an API for it.
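
For reference, in the 0.20.x line the two JobConf setters Harsh links to
appear to be backed by the properties mapred.max.map.failures.percent and
mapred.max.reduce.failures.percent. Treat those names as an assumption to
verify against JobConf.java for your release, since they are missing from the
published docs. A minimal sketch of setting them in a job configuration file,
with 10 as an illustrative value:

```xml
<!-- Assumed property names; check JobConf.java in your Hadoop release. -->
<property>
  <name>mapred.max.map.failures.percent</name>
  <value>10</value> <!-- tolerate up to 10% failed map tasks -->
</property>
<property>
  <name>mapred.max.reduce.failures.percent</name>
  <value>10</value> <!-- tolerate up to 10% failed reduce tasks -->
</property>
```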

Cloudera published a blog entry, 'Automatically Documenting Apache Hadoop
Configuration'. It would be great if that tool were contributed to Apache and
made part of the build process. I suggested this before, but there was no
response.

http://www.cloudera.com/blog/2011/08/automatically-documenting-apache-hadoop-configuration/

Regards,
Praveen

On Sun, Dec 4, 2011 at 9:07 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Mat,
>
> Perhaps you can simply set a failure tolerance percentage for your job.
>
> Doable via
> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)
>  and
> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)
>
> If you set it to 10%, your job still passes as long as no more than 10% of
> its Map or Reduce tasks fail. I think this fits your use case.
>
> On 04-Dec-2011, at 4:05 AM, Mat Kelcey wrote:
>
> Hi folks,
>
> I have a Hadoop 0.20.2 map-only job with thousands of input tasks.
> I'm using the org.apache.nutch.tools.arc.ArcInputFormat input format,
> so each task corresponds to a single file in HDFS.
>
> Most of the way into the job it hits a task that causes the input
> format to OOM. After 4 attempts it fails the job.
> Now this is obviously not great, but for the purpose of my job I'd be
> happy to just throw this input file away; it's only one of thousands
> and I don't need exact results.
>
> The trouble is I can't work out which file this task corresponds to.
>
> The closest I can find is that the job history file lists a STATE_STRING, e.g.
> STATE_STRING="hdfs://ip-10-115-29-44\.ec2\.internal:9000/user/hadoop/arc_files\.aa/2009/09/17/0/1253240925734_0\.arc\.gz:0+100425468"
>
> but this is _only_ for the successfully completed ones; for the failed
> one I'm actually interested in there is nothing:
> MapAttempt TASK_TYPE="MAP" TASKID="task_201112030459_0011_m_004130"
> TASK_ATTEMPT_ID="attempt_201112030459_0011_m_004130_0"
> TASK_STATUS="FAILED" FINISH_TIME="1322901661261"
> HOSTNAME="ip-10-218-57-227\.ec2\.internal" ERROR="Error: null" .
>
> I grepped through all the Hadoop logs and couldn't find anything that
> relates this task to the files in its split.
> Any ideas where this info might be recorded?
>
> Cheers,
> Mat
>
>
>
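
The job history records Mat quotes are flat KEY="value" lines with
backslash-escaped dots, so pulling task IDs and split paths out of them is
mechanical. Below is a minimal sketch, assuming the record layout is exactly
as shown in the quoted excerpts; the class name HistoryLine and its methods
are hypothetical helpers, not part of Hadoop. One possible use is to collect
the split paths of all successful attempts and compare them with the input
directory listing, on the assumption that the file missing from that set
belongs to the failed task.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HistoryLine {
    // Matches KEY="value", where the value may contain backslash-escaped characters.
    private static final Pattern ATTR =
        Pattern.compile("(\\w+)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Parses the KEY="value" attributes of one job history record. */
    public static Map<String, String> parse(String line) {
        Map<String, String> attrs = new LinkedHashMap<>();
        Matcher m = ATTR.matcher(line);
        while (m.find()) {
            // Drop the backslash in front of escaped characters such as "\.".
            attrs.put(m.group(1), m.group(2).replaceAll("\\\\(.)", "$1"));
        }
        return attrs;
    }

    /** Strips the trailing ":start+length" of a split description, leaving the path. */
    public static String splitPath(String stateString) {
        return stateString.trim().replaceFirst(":\\d+\\+\\d+$", "");
    }

    public static void main(String[] args) {
        // Shortened version of the failed-attempt record from the thread.
        String line = "MapAttempt TASK_TYPE=\"MAP\" "
            + "TASKID=\"task_201112030459_0011_m_004130\" "
            + "TASK_STATUS=\"FAILED\" "
            + "HOSTNAME=\"ip-10-218-57-227\\.ec2\\.internal\" ERROR=\"Error: null\" .";
        Map<String, String> attrs = parse(line);
        System.out.println(attrs.get("TASKID") + " " + attrs.get("TASK_STATUS"));
    }
}
```

For a successful attempt, splitPath(parse(line).get("STATE_STRING")) would
give the HDFS path of the file behind that task.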