How do you load data from S3 on Amazon EMR with Pig 0.10.0?
Russell Jurney 2012-06-22, 01:57
My script is simple:
/* Avro */
register /home/hadoop/pig-0.10.0/build/ivy/lib/Pig/avro-1.5.3.jar
register /home/hadoop/pig-0.10.0/build/ivy/lib/Pig/json-simple-1.1.jar
register /home/hadoop/pig-0.10.0/contrib/piggybank/java/piggybank.jar
register /home/hadoop/pig-0.10.0/build/ivy/lib/Pig/jackson-core-asl-1.7.3.jar
register /home/hadoop/pig-0.10.0/build/ivy/lib/Pig/jackson-mapper-asl-1.7.3.jar

define AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();

emails = LOAD 's3://rjurney_public_web/hadoop/enron.avro' using AvroStorage();

The error confuses me. Why can't I load data from s3?

2012-06-22 01:52:50,893 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Invalid hostname in URI s3://rjurney_public_web/hadoop/enron.avro
2012-06-22 01:52:50,893 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.IllegalArgumentException: Invalid hostname in URI s3://rjurney_public_web/hadoop/enron.avro
    at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:436)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1327)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:65)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1345)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:244)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:70)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:53)
    at org.apache.pig.builtin.JsonMetadata.findMetaFile(JsonMetadata.java:106)
    at org.apache.pig.builtin.JsonMetadata.getSchema(JsonMetadata.java:188)
    at org.apache.pig.builtin.PigStorage.getSchema(PigStorage.java:466)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:110)
    at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
    at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:219)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1635)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1566)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1538)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:490)
    at org.apache.pig.Main.main(Main.java:111)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
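
One possible reading of the trace, offered as an assumption and not something confirmed in this thread: the exception is raised in S3Credentials.initialize, and the message suggests that no hostname could be extracted from the URI. java.net.URI does not accept underscores in a server hostname, so for a bucket named rjurney_public_web its getHost() returns null. The sketch below only demonstrates that JDK behavior; the class name UriHostCheck, the helper hostOf, and the hyphenated bucket name are hypothetical and used purely for comparison.

```java
import java.net.URI;

public class UriHostCheck {
    // Returns the host component that java.net.URI recognizes, or null if it finds none.
    static String hostOf(String location) {
        return URI.create(location).getHost();
    }

    public static void main(String[] args) {
        // The URI from the script above: the bucket name contains underscores.
        String withUnderscores = "s3://rjurney_public_web/hadoop/enron.avro";
        // Hypothetical bucket name with hyphens, for comparison only.
        String withHyphens = "s3://rjurney-public-web/hadoop/enron.avro";

        System.out.println(hostOf(withUnderscores)); // prints "null"
        System.out.println(hostOf(withHyphens));     // prints "rjurney-public-web"
    }
}
```

If this is indeed the cause, copying the data to a bucket whose name contains no underscores would be one thing to try. Whether that resolves the error on EMR is untested here.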