Re: Pig 0.11.1 on AWS EMR/S3 fails to cleanup failed task output file before retrying that task
Cheolsoo Park 2013-06-13, 18:18
>> Seems like this is exactly the kind of task restart that should "just
>> work" if the garbage from the failed task were properly cleaned up.
Unfortunately, this is not the case because of S3 eventual consistency. Even
though the failed task cleans up its files on S3, the delete is not
immediately propagated, so the next task attempt may still see them and fail. As
far as I know, EMR Pig/S3 integration is not as good as EMR Hive/S3
integration. So you will have to handle S3 eventual consistency yourself in
One workaround is to write a StoreFunc that stages data to HDFS until the task
completes and then copies it to S3 at the commit-task step. This will
minimize the number of S3 eventual consistency issues you see.
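
To make the stage-then-commit idea concrete, here is a minimal sketch in Python. It is only an illustration of the pattern, not Pig or Hadoop code: local directories stand in for HDFS (staging) and S3 (final), and the class and method names (StagedTaskOutput, setup, write, commit, abort) are hypothetical. In a real job this logic would presumably live in Java, in the OutputFormat/OutputCommitter that your StoreFunc returns.

```python
import shutil
import tempfile
from pathlib import Path


class StagedTaskOutput:
    """Write a task attempt's output to staging; publish only on commit.

    Hypothetical helper: local directories stand in for HDFS and S3.
    """

    def __init__(self, staging_root, final_dir, attempt_id):
        self.final_dir = Path(final_dir)
        # Each attempt gets its own staging directory, so a retry never
        # collides with leftovers from a killed attempt.
        self.attempt_dir = Path(staging_root) / attempt_id

    def setup(self):
        # Discard anything left behind under this same attempt id.
        shutil.rmtree(self.attempt_dir, ignore_errors=True)
        self.attempt_dir.mkdir(parents=True)

    def write(self, name, data):
        # All writes go to staging, never to the final location.
        (self.attempt_dir / name).write_text(data)

    def commit(self):
        # Only an attempt that ran to completion copies to the final location.
        self.final_dir.mkdir(parents=True, exist_ok=True)
        for staged in self.attempt_dir.iterdir():
            shutil.copy2(staged, self.final_dir / staged.name)
        shutil.rmtree(self.attempt_dir)

    def abort(self):
        # A killed or failed attempt only ever touched its staging directory.
        shutil.rmtree(self.attempt_dir, ignore_errors=True)


def demo():
    root = Path(tempfile.mkdtemp())
    staging, final = root / "staging", root / "final"

    # Attempt _0 writes partial output and is killed before commit.
    first = StagedTaskOutput(staging, final, "attempt_m_000009_0")
    first.setup()
    first.write("part-m-00009", "partial")
    first.abort()
    final_untouched = not final.exists()

    # Attempt _1 retries, finishes, and commits.
    second = StagedTaskOutput(staging, final, "attempt_m_000009_1")
    second.setup()
    second.write("part-m-00009", "complete")
    second.commit()

    result = (final / "part-m-00009").read_text()
    shutil.rmtree(root)
    return final_untouched, result
```

The point of the design is that the final location is written only once per task, by an attempt that has already succeeded, so a retry has nothing stale to trip over there.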
On Thu, Jun 13, 2013 at 7:40 AM, Alan Crosswell <[EMAIL PROTECTED]> wrote:
> The file did not exist until the first task attempt created it before it
> was killed. As such the subsequent task attempts were guaranteed to fail
> since the killed task's output file had not been cleaned up. So when I
> launched the Pig script, there was no file in the way.
> I'll take a look at upping the timeout.
> On Thu, Jun 13, 2013 at 9:57 AM, Dan DeCapria, CivicScience <
> [EMAIL PROTECTED]> wrote:
> > Hi Alan,
> > I believe this is expected behavior wrt EMR and S3. There cannot exist a
> > duplicate file path in S3 prior to commit; in your case it looks like
> > bucket: n2ygk, path: reduced.1/useful/part-m-00009*/file -> file. On EMR,
> > to mitigate hanging tasks, a given job may spawn duplicate tasks
> > (referenced by a trailing _0, _1, etc.). This then becomes a race
> > condition issue wrt duplicate tasks (_0, _1, etc.) committing to the same
> > bucket/path in S3.
> > In addition, you may also consider changing the task timeout from 600s to
> > something higher/lower so that tasks time out less/more often (I think the
> > lowest bound is 60000ms). I've had jobs which required a *two hour* timeout in
> > order to succeed. This can be done with a bootstrap action, i.e.:
> > --bootstrap-action
> > s3://elasticmapreduce/bootstrap-actions/configure-hadoop
> > --args -m,mapred.task.timeout=2400000
> > As for the cleaning up of intermediate steps, I'm not sure. Possibly try
> > implementing EXEC
> > <https://pig.apache.org/docs/r0.11.1/cmds.html#exec> breakpoints prior
> > to problem blocks, but this will weaken Pig's job chaining
> > and increase the execution time.
> > Hope this helps.
> > -Dan
> > On Wed, Jun 12, 2013 at 11:21 PM, Alan Crosswell <[EMAIL PROTECTED]>
> > wrote:
> > > Is this expected behavior or improper error recovery:
> > >
> > > *Task attempt_201306130117_0001_m_000009_0 failed to report status for 602
> > > seconds. Killing!*
> > >
> > > This was then followed by the retries of the task failing due to the
> > > existence of the S3 output file that the dead task had started writing:
> > >
> > > *org.apache.pig.backend.executionengine.ExecException: ERROR 2081: Unable
> > > to setup the store function.*
> > > *...*
> > > *Caused by: java.io.IOException: File already
> > > exists:s3n://n2ygk/reduced.1/useful/part-m-00009*
> > >
> > > Seems like this is exactly the kind of task restart that should "just work"
> > > if the garbage from the failed task were properly cleaned up.
> > >
> > > Is there a way to tell Pig to just clobber output files?
> > >
> > > Is there a technique for checkpointing Pig scripts so that I don't have to
> > > keep resubmitting this job and losing hours of work? I was even doing
> > > "STORE" of intermediate aliases so I could restart later, but the job
> > > failure causes the intermediate files to be deleted from S3.
> > >
> > > Thanks.
> > > /a
> > >