Pig, mail # user - PigStorageSchema and S3 bug


Re: PigStorageSchema and S3 bug
Cheolsoo Park 2012-10-12, 23:16
Hi Meghana,

Are you sure that you're using Apache Pig version 0.10.0-cdh4.1.0?
PigStorageSchema was changed in Pig 0.10 (
https://issues.apache.org/jira/browse/PIG-2143), so that version cannot
produce this frame from your call stack:

            at org.apache.pig.piggybank.storage.PigStorageSchema.storeSchema(PigStorageSchema.java:152)

In Pig 0.10, PigStorageSchema.java is only 45 lines long, so a frame at
line 152 cannot come from it. I double-checked that Pig version
0.10.0-cdh4.1.0 includes PIG-2143. It looks like you're using an older
version of Pig, something like 0.9.2-cdh4.0.1.
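
The check above can be done mechanically: if a stack frame points at a line
number larger than the source file's length, the running jar was not built
from that source. Here is a minimal Python sketch of that comparison. The
function names are hypothetical, and the 45-line figure is simply the one
quoted above, not something the script looks up.

```python
import re

# Matches the "(File.java:NNN)" location at the end of a Java stack frame.
FRAME_RE = re.compile(r"\((?P<file>[\w$]+\.java):(?P<line>\d+)\)")


def frame_location(frame):
    """Return (file_name, line_number) from a stack frame, or None."""
    m = FRAME_RE.search(frame)
    if m is None:
        return None
    return m.group("file"), int(m.group("line"))


def frame_fits_source(frame, source_line_count):
    """True if the frame's line number can exist in a file of that length."""
    loc = frame_location(frame)
    if loc is None:
        raise ValueError("no (File.java:NNN) location in frame")
    return loc[1] <= source_line_count


# Frame taken from the reported call stack.
frame = ("org.apache.pig.piggybank.storage.PigStorageSchema"
         ".storeSchema(PigStorageSchema.java:152)")

print(frame_location(frame))         # ('PigStorageSchema.java', 152)
print(frame_fits_source(frame, 45))  # False: line 152 cannot be in a 45-line file
```

A False result only says the jar and the source differ. It does not say
which Pig version is actually on the classpath.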

Thanks,
Cheolsoo

On Fri, Oct 12, 2012 at 1:50 PM, Meghana Narasimhan <
[EMAIL PROTECTED]> wrote:

> Hello,
>
> We are using PigStorageSchema to store our results on S3 with HDFS still
> as the file system and we are running into issues writing out the schema
> file to S3.
>
> We are just loading a CSV file using PigStorage, running through some
> basic Pig stuff and then storing it on S3 using PigStorageSchema. We are on
> Hadoop 2.0.0-cdh4.1.0 and Apache Pig version 0.10.0-cdh4.1.0.
>
>
> {code}
> A = LOAD 'input' USING PigStorage(',');
> B = FOREACH A GENERATE $0 AS A1, $1 AS A2, $2 AS A3;
> C = LIMIT B 3;
> STORE C INTO 's3n://XXX:XXX@bucket/outPigStorageSchema1' USING org.apache.pig.piggybank.storage.PigStorageSchema();
> {code}
>
> Pig logs :
>
> {code}
> 2012-10-11 21:00:56,193 [main] INFO
>  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
> script: LIMIT
> 2012-10-11 21:00:56,209 [main] INFO
>  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned
> for A: $3, $4, $5, $6
> 2012-10-11 21:00:56,250 [main] WARN
>  org.jets3t.service.impl.rest.httpclient.RestS3Service - Response
> '/Meg/outPigStorageSchema1' - Unexpected response code 404, expected 200
> 2012-10-11 21:00:57,174 [main] WARN
>  org.jets3t.service.impl.rest.httpclient.RestS3Service - Response
> '/Meg/outPigStorageSchema1_%24folder%24' - Unexpected response code 404,
> expected 200
> 2012-10-11 21:00:57,212 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler -
> File concatenation threshold: 100 optimistic? false
> 2012-10-11 21:00:57,218 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> - MR plan size before optimization: 1
> 2012-10-11 21:00:57,218 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> - MR plan size after optimization: 1
> 2012-10-11 21:00:57,221 [main] INFO
>  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added
> to the job
> 2012-10-11 21:00:57,221 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
> 2012-10-11 21:00:57,222 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - creating jar file Job7469072732967367765.jar
> 2012-10-11 21:01:02,810 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - jar file Job7469072732967367765.jar created
> 2012-10-11 21:01:02,815 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Setting up single store job
> 2012-10-11 21:01:02,830 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 1 map-reduce job(s) waiting for submission.
> 2012-10-11 21:01:02,884 [Thread-64] WARN
>  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing
> the arguments. Applications should implement Tool for the same.
> 2012-10-11 21:01:03,256 [Thread-64] WARN
>  org.jets3t.service.impl.rest.httpclient.RestS3Service - Response
> '/Meg/outPigStorageSchema1' - Unexpected response code 404, expected 200
> 2012-10-11 21:01:03,332 [main] INFO
>  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 0% complete
> 2012-10-11 21:01:03,502 [Thread-64] WARN