Re: PigStorageSchema and S3 bug
Cheolsoo Park 2012-10-12, 23:16
Hi Meghana,
Are you sure that you're using Apache Pig version 0.10.0-cdh4.1.0? Because a change was made to PigStorageSchema in Pig 0.10 (https://issues.apache.org/jira/browse/PIG-2143), it is not possible to get your call stack:

    at org.apache.pig.piggybank.storage.PigStorageSchema.storeSchema(PigStorageSchema.java:152)

In Pig 0.10, PigStorageSchema.java is only 45 lines long.

I double-checked that Pig version 0.10.0-cdh4.1.0 includes PIG-2143. It looks like you're using an older version of Pig, something like 0.9.2-cdh4.0.1.

Thanks,
Cheolsoo

On Fri, Oct 12, 2012 at 1:50 PM, Meghana Narasimhan <[EMAIL PROTECTED]> wrote:
> Hello,
>
> We are using PigStorageSchema to store our results on S3 with HDFS still as the file system and we are running into issues writing out the schema file to s3.
>
> We are just loading a CSV file using PigStorage, running through some basic Pig stuff and then storing it on S3 using PigStorageSchema. We are on Hadoop 2.0.0-cdh4.1.0 and Apache Pig version 0.10.0-cdh4.1.0.
>
> {code}
> A = LOAD 'input' USING PigStorage(',');
> B = FOREACH A GENERATE $0 AS A1, $1 AS A2, $2 AS A3;
> C = LIMIT B 3;
> STORE C INTO 's3n://XXX:XXX@bucket/outPigStorageSchema1' USING org.apache.pig.piggybank.storage.PigStorageSchema();
> {code}
>
> Pig logs :
>
> {code}
> 2012-10-11 21:00:56,193 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: LIMIT
> 2012-10-11 21:00:56,209 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for A: $3, $4, $5, $6
> 2012-10-11 21:00:56,250 [main] WARN org.jets3t.service.impl.rest.httpclient.RestS3Service - Response '/Meg/outPigStorageSchema1' - Unexpected response code 404, expected 200
> 2012-10-11 21:00:57,174 [main] WARN org.jets3t.service.impl.rest.httpclient.RestS3Service - Response '/Meg/outPigStorageSchema1_%24folder%24' - Unexpected response code 404, expected 200
> 2012-10-11 21:00:57,212 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2012-10-11 21:00:57,218 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
> 2012-10-11 21:00:57,218 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
> 2012-10-11 21:00:57,221 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
> 2012-10-11 21:00:57,221 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2012-10-11 21:00:57,222 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job7469072732967367765.jar
> 2012-10-11 21:01:02,810 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job7469072732967367765.jar created
> 2012-10-11 21:01:02,815 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2012-10-11 21:01:02,830 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
> 2012-10-11 21:01:02,884 [Thread-64] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 2012-10-11 21:01:03,256 [Thread-64] WARN org.jets3t.service.impl.rest.httpclient.RestS3Service - Response '/Meg/outPigStorageSchema1' - Unexpected response code 404, expected 200
> 2012-10-11 21:01:03,332 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2012-10-11 21:01:03,502 [Thread-64] WARN