FileAlreadyExistsException while running pig


Haitao Yao 2012-08-10, 02:42
Re: FileAlreadyExistsException while running pig
Usually that means the directory you are trying to store to already exists.  Pig won't overwrite existing data.  You should either move or remove the directory, or change the directory name in your store function.
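
For an output directory that a script names explicitly, a minimal sketch of the "remove the directory" option might look like the following. The paths and the relation name are hypothetical placeholders, not taken from the original script, and this does not address directories that Pig creates on its own under /tmp/pig-temp. Note that rmf deletes the target permanently, so it only makes sense when the old output is disposable.

```pig
-- Hypothetical paths and relation name, for illustration only.
-- rmf removes the path if it exists and does not fail if it is missing.
rmf /user/example/output/result;

A = LOAD '/user/example/input' AS (line:chararray);

-- The target no longer exists at this point, so the output check should pass.
STORE A INTO '/user/example/output/result';
```

If the old output needs to be kept, the other option is to store to a new directory name each run, for example by passing it in with Pig's parameter substitution (pig -param out=... with '$out' in the STORE statement).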

Alan.

On Aug 9, 2012, at 7:42 PM, Haitao Yao wrote:

> hi, all
> I got this while running pig script:
>
> 997: Unable to recreate exception from backend error:
> org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://DC-hadoop01:9000/tmp/pig-temp/temp548500412/tmp-1456742965 already exists
>        at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecsHelper(PigOutputFormat.java:207)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:188)
>        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:893)
>        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:415)
>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
>        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:856)
>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:830)
>        at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
>        at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>        at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>        at java.lang.Thread.run(Thread.java:722)
>
>
> But I checked the script, and the directory hdfs://DC-hadoop01:9000/tmp/pig-temp/temp548500412/tmp-1456742965 is not used by the script explicitly, so I think Pig uses it to store temporary results.
> But why does it already exist? Isn't it supposed to be unique?
>
> Haitao Yao
> [EMAIL PROTECTED]
> weibo: @haitao_yao
> Skype:  haitao.yao.final
>
Mohammad Tariq 2012-08-10, 18:51