Pig >> mail # user >> How do you load data from S3 on Amazon EMR with Pig 0.10.0?


How do you load data from S3 on Amazon EMR with Pig 0.10.0?
My script is simple:

/* Avro */
register /home/hadoop/pig-0.10.0/build/ivy/lib/Pig/avro-1.5.3.jar
register /home/hadoop/pig-0.10.0/build/ivy/lib/Pig/json-simple-1.1.jar
register /home/hadoop/pig-0.10.0/contrib/piggybank/java/piggybank.jar
register /home/hadoop/pig-0.10.0/build/ivy/lib/Pig/jackson-core-asl-1.7.3.jar
register /home/hadoop/pig-0.10.0/build/ivy/lib/Pig/jackson-mapper-asl-1.7.3.jar

define AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();

emails = LOAD 's3://rjurney_public_web/hadoop/enron.avro' using
AvroStorage();
The error confuses me. Why can't I load data from S3?

2012-06-22 01:52:50,893 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Invalid hostname in URI s3://rjurney_public_web/hadoop/enron.avro
2012-06-22 01:52:50,893 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.IllegalArgumentException: Invalid hostname in URI s3://rjurney_public_web/hadoop/enron.avro
    at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:436)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1327)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:65)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1345)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:244)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:70)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:53)
    at org.apache.pig.builtin.JsonMetadata.findMetaFile(JsonMetadata.java:106)
    at org.apache.pig.builtin.JsonMetadata.getSchema(JsonMetadata.java:188)
    at org.apache.pig.builtin.PigStorage.getSchema(PigStorage.java:466)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:151)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:110)
    at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
    at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:219)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1635)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1566)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1538)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:490)
    at org.apache.pig.Main.main(Main.java:111)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
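
One possible reading of the trace, offered as a sketch rather than a confirmed diagnosis: the exception comes from S3Credentials.initialize, and "Invalid hostname in URI" suggests the URI's host component could not be read. java.net.URI does not accept an underscore in a hostname, so for a bucket name like rjurney_public_web, getHost() returns null. The small program below only demonstrates that URI behaviour; the class name S3UriHostCheck, the helper hostOf, and the hyphenated bucket name are hypothetical and used purely for comparison.

```java
import java.net.URI;

public class S3UriHostCheck {

    // Returns the host component as java.net.URI parses it,
    // or null when the authority is not a valid server-based hostname.
    static String hostOf(String location) {
        return URI.create(location).getHost();
    }

    public static void main(String[] args) {
        // Path from the failing LOAD statement.
        String withUnderscore = "s3://rjurney_public_web/hadoop/enron.avro";
        // Hypothetical bucket name, shown only to contrast the parsing result.
        String withHyphen = "s3://rjurney-public-web/hadoop/enron.avro";

        System.out.println(withUnderscore + " -> host = " + hostOf(withUnderscore));
        System.out.println(withHyphen + " -> host = " + hostOf(withHyphen));
    }
}
```

If this is indeed the cause, the script itself is probably fine and the bucket name is the part the Hadoop S3 filesystem cannot handle; whether a bucket with a hostname-safe name loads correctly on EMR with this Pig build would still need to be tested.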

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com