-
FileAlreadyExistsException while running pig
Haitao Yao 2012-08-10, 02:42
Hi all,
I got this while running a Pig script:
997: Unable to recreate exception from backend error: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://DC-hadoop01:9000/tmp/pig-temp/temp548500412/tmp-1456742965 already exists
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecsHelper(PigOutputFormat.java:207)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:188)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:893)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:856)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:830)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:722)
But I checked the script, and the directory hdfs://DC-hadoop01:9000/tmp/pig-temp/temp548500412/tmp-1456742965 is not used by the script explicitly, so I think Pig uses it to store temporary results. But why does it already exist? Isn't it supposed to be unique?
Haitao Yao [EMAIL PROTECTED] weibo: @haitao_yao Skype: haitao.yao.final
-
Re: FileAlreadyExistsException while running pig
Alan Gates 2012-08-10, 17:48
Usually that means the directory you are trying to store to already exists. Pig won't overwrite existing data. You should either move or remove the directory, or change the directory name in your store function.
Alan.
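The "remove the directory" option from the reply above can be sketched as a small cleanup step that runs before the job is started again. This is only an illustration under stated assumptions: it uses java.nio.file on the local filesystem as a stand-in for HDFS, and the names OutputDirPrep, removeIfPresent and demo are hypothetical, not part of Pig or Hadoop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Local-filesystem stand-in for clearing a job's output location before a re-run.
// Class and method names are hypothetical; on a cluster the path would live in HDFS.
class OutputDirPrep {

    // Deletes dir and everything under it; returns true if something was removed.
    static boolean removeIfPresent(Path dir) {
        if (!Files.exists(dir)) {
            return false;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order so that children are deleted before their parents.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    // Creates a leftover output directory, clears it, and reports whether it is gone.
    static boolean demo() {
        try {
            Path parent = Files.createTempDirectory("pig-store-demo");
            Path out = parent.resolve("results");
            Files.createDirectories(out);
            Files.createFile(out.resolve("part-00000"));
            boolean removed = removeIfPresent(out);
            boolean gone = !Files.exists(out);
            removeIfPresent(parent); // tidy up the scratch directory
            return removed && gone;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("output directory cleared: " + demo());
    }
}
```

Deleting is destructive, so moving the old directory aside or storing to a new name, the other two options mentioned in the reply, is the safer choice when the existing data may still be needed.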
-
Re: FileAlreadyExistsException while running pig
Mohammad Tariq 2012-08-10, 18:51
Hello Haitao,
Each time a MapReduce job runs, it expects its output path not to exist. If the output path is already there, a FileAlreadyExistsException is thrown. Since each Pig job is ultimately run as a MapReduce job, Pig expects the same.
Regards, Mohammad Tariq
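The rule described in this reply, that a job is rejected when its output path already exists, can be illustrated with a minimal sketch. It does not reproduce Hadoop's FileOutputFormat.checkOutputSpecs; it only mimics the behavior on the local filesystem, and it throws the JDK's java.nio.file.FileAlreadyExistsException as a stand-in for org.apache.hadoop.mapred.FileAlreadyExistsException. The names OutputSpecCheckSketch, checkOutputDir, wouldSubmit and demo are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Simplified imitation of an "output must not exist" pre-submission check.
// Not Hadoop code: local paths and the JDK exception type are stand-ins.
class OutputSpecCheckSketch {

    // Refuses to proceed when the output location is already there.
    static void checkOutputDir(Path outputDir) throws FileAlreadyExistsException {
        if (Files.exists(outputDir)) {
            throw new FileAlreadyExistsException(
                    "Output directory " + outputDir + " already exists");
        }
    }

    // Returns true if the check would let a job with this output directory start.
    static boolean wouldSubmit(Path outputDir) {
        try {
            checkOutputDir(outputDir);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    // Runs the check twice: once with the path absent, once with it present.
    static boolean demo() {
        try {
            Path parent = Files.createTempDirectory("output-check-demo");
            Path out = parent.resolve("tmp-output");
            boolean firstRun = wouldSubmit(out);   // path absent: accepted
            Files.createDirectory(out);            // simulate output left behind
            boolean secondRun = wouldSubmit(out);  // path present: rejected
            Files.delete(out);
            Files.delete(parent);
            return firstRun && !secondRun;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("check behaves as described: " + demo());
    }
}
```

The sketch only shows why a leftover directory blocks a later job. It does not explain how the temporary directory from the original question came to exist in the first place.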