Hadoop, mail # user - Re: Job cleanup


Re: Job cleanup
Robert Dyer 2013-04-17, 08:44
I think the problem is that I need to report progress() from my job-cleanup task.
How can I do this?

The commitJob() in my custom
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter [1]
only receives an org.apache.hadoop.mapreduce.JobContext [2],
which has no getProgressible() method like the old
org.apache.hadoop.mapred.JobContext [3] does.

[1]
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.html#commitJob%28org.apache.hadoop.mapreduce.JobContext%29
[2]
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/JobContext.html
[3]
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/JobContext.html#getProgressible%28%29
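
Since the new-API JobContext exposes no Progressable, one workaround (a sketch, not an official Hadoop API) is to run a daemon "heartbeat" thread for the duration of the slow commit work, pinging whatever progress hook is available. The Progressable interface below is a local stand-in mirroring org.apache.hadoop.util.Progressable so the sketch compiles without Hadoop on the classpath; whether you can actually obtain a Progressable from the context object at runtime depends on the Hadoop version.

```java
import java.util.concurrent.atomic.AtomicLong;

// Local stand-in for org.apache.hadoop.util.Progressable (assumption:
// declared here only so the sketch is self-contained).
interface Progressable {
    void progress();
}

// Keep-alive helper: a daemon thread that calls progress() on a fixed
// interval so the framework does not declare a long-running cleanup
// task dead. In commitJob() you would start it before the slow file
// moves and stop it in a finally block (or try-with-resources).
class ProgressHeartbeat implements AutoCloseable {
    private final Thread thread;
    private volatile boolean running = true;

    ProgressHeartbeat(Progressable target, long intervalMillis) {
        thread = new Thread(() -> {
            while (running) {
                target.progress();               // report liveness
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    return;                      // close() interrupts us
                }
            }
        });
        thread.setDaemon(true);
        thread.start();
    }

    @Override
    public void close() {
        running = false;
        thread.interrupt();
    }
}

public class HeartbeatDemo {
    public static void main(String[] args) throws Exception {
        AtomicLong calls = new AtomicLong();
        // Stand-in for slow commit work wrapped by the heartbeat.
        try (ProgressHeartbeat hb =
                 new ProgressHeartbeat(calls::incrementAndGet, 10)) {
            Thread.sleep(120);
        }
        System.out.println("progress calls: " + calls.get());
    }
}
```

In real cleanup code the Progressable would come from the framework (e.g. via the old mapred API's getProgressible()), not from a locally defined interface.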

On Sat, Apr 13, 2013 at 2:35 PM, Robert Dyer <[EMAIL PROTECTED]> wrote:

> What does the job cleanup task do?  My understanding was that it just cleans
> up any intermediate/temporary files and moves the reducer output to the
> final output directory. Does it do more?
>
> One of my jobs runs, all maps and reduces finish, but then the job cleanup
> task never finishes.  Instead, it fails to report status and is killed
> repeatedly until the entire job is killed:
>
> Task attempt_201303272327_0772_m_000105_0 failed to report status for 600 seconds. Killing!
>
>
> I suppose that since my reducers generate around 20 GB of output, moving it
> perhaps takes longer than the task timeout?
>
> Is it possible to disable speculative execution *only* for the cleanup
> task?
>
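
For reference on the log line quoted above: the 600-second limit is the MR1 task timeout, controlled by the mapred.task.timeout property (in milliseconds, default 600000). Raising it, in mapred-site.xml or per job, is a blunt workaround when a legitimately slow cleanup phase cannot report progress; the value below is only an illustrative example, not a recommendation.

```xml
<!-- mapred-site.xml fragment: raise the task liveness timeout to 30
     minutes (1800000 ms). Example value; tune to your cleanup time. -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value>
</property>
```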