Pig >> mail # user >> How can I read Hive text files on S3 from Pig?
Re: How can I read Hive text files on S3 from Pig?
Hi Dmitriy,
here is the stack trace:

java.lang.IllegalArgumentException: Wrong FS: s3n://xxx/yyy/zz/, expected: hdfs://namenode.adsf.companyname.com
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:433)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:129)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:523)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:820)
        at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:203)
        at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:131)
        at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:147)
        at org.apache.pig.impl.io.FileLocalizer.fullPath(FileLocalizer.java:316)
        at org.apache.pig.piggybank.storage.JsonMetadata.findMetaFile(JsonMetadata.java:94)
        at org.apache.pig.piggybank.storage.JsonMetadata.getSchema(JsonMetadata.java:154)
        at org.apache.pig.piggybank.storage.AllLoader.getSchema(AllLoader.java:400)
        at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
        at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
        at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
        at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
        at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
        at org.apache.pig.PigServer$Graph.compile(PigServer.java:1679)
        at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
        at org.apache.pig.Main.run(Main.java:495)
        at org.apache.pig.Main.main(Main.java:111)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
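
Reading the trace from the bottom up, the metadata lookup (JsonMetadata.findMetaFile via FileLocalizer.fullPath) hands the s3n:// path to a DistributedFileSystem instance, and FileSystem.checkPath rejects it because the path does not belong to that file system. The sketch below is only a simplified model of that kind of scheme-and-authority check, written with plain java.net.URI; it is not Hadoop's actual implementation, and the class name, method names, and example URIs are hypothetical.

```java
import java.net.URI;

public class WrongFsSketch {

    // Simplified model: a path belongs to a file system when its scheme
    // (and authority, if present) match the file system's own URI.
    static boolean sameFileSystem(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            return true; // scheme-less paths are resolved against this file system
        }
        if (!path.getScheme().equalsIgnoreCase(fsUri.getScheme())) {
            return false; // e.g. s3n vs hdfs
        }
        String pathAuthority = path.getAuthority();
        return pathAuthority == null
                || pathAuthority.equalsIgnoreCase(fsUri.getAuthority());
    }

    // Mirrors the shape of the error message seen in the trace.
    static void checkPath(URI fsUri, URI path) {
        if (!sameFileSystem(fsUri, path)) {
            throw new IllegalArgumentException(
                    "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode.example.com"); // assumed default FS
        URI s3Path = URI.create("s3n://bucket/dir/");              // assumed S3 location
        try {
            checkPath(defaultFs, s3Path);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In Hadoop code generally, this class of error tends to appear when a path is checked against the default file system (for example one obtained with FileSystem.get(conf)) instead of the file system derived from the path itself (Path.getFileSystem(conf)). Whether that is exactly what happens inside AllLoader here is an assumption based on the trace, not something confirmed.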
Thanks for taking a look. I will start looking into HCatalog too.

Martin
On 12 October 2012 18:56, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:

> Martin,
> Do you have the complete stack trace?
> Generally, for Hive interop I recommend HCatalog; AllLoader is neat
> but it's a 3rd party contrib and we don't really know it too well. I
> can check out the error dump and see if there's anything obvious
> though.
>
> D
>
> On Fri, Oct 12, 2012 at 8:48 AM, Martin Goodson
> <[EMAIL PROTECTED]> wrote:
> > I am trying to load some text files in hive partitions on S3 using the
> > AllLoader function with no success. I get an error which indicates that
> > AllLoader is expecting the files to be on hdfs:
> >
> > a = LOAD 's3n://xxxxx/yyyyy/zzz' using org.apache.pig.piggybank.storage.AllLoader();
> > grunt> 2012-10-12 14:51:26,229 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Wrong FS: s3n://xxxxx/yyyyy/zzz, expected: hdfs://namenode.hadoop.companyname.com