Pig >> mail # user >> How can I read Hive text files on S3 from Pig?
Re: How can I read Hive text files on S3 from Pig?
Yeah, that's a bug in FileLocalizer; apparently it assumes local or
HDFS only. Could you file a JIRA?
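
To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use Hadoop or Pig code: `WrongFsSketch`, `SketchFileSystem`, and `fileSystemFor` are hypothetical names, and the scheme/authority check is a simplified model of the kind of validation that raises "Wrong FS" in the trace below. The assumption is that the path is being checked against the cluster's default filesystem rather than one chosen from the path's own URI.

```java
import java.net.URI;

public class WrongFsSketch {

    /** Hypothetical stand-in for a filesystem bound to one scheme and authority. */
    static final class SketchFileSystem {
        private final URI root;

        SketchFileSystem(URI root) {
            this.root = root;
        }

        /** Simplified model: reject paths that belong to a different filesystem. */
        void checkPath(URI path) {
            String scheme = path.getScheme();
            if (scheme == null) {
                return; // unqualified path: treated as belonging to this filesystem
            }
            boolean sameScheme = scheme.equalsIgnoreCase(root.getScheme());
            String authority = path.getAuthority();
            boolean sameAuthority =
                authority == null || authority.equalsIgnoreCase(root.getAuthority());
            if (!sameScheme || !sameAuthority) {
                throw new IllegalArgumentException(
                    "Wrong FS: " + path + ", expected: " + root);
            }
        }
    }

    /** Picks a filesystem from the path's own scheme and authority, else the default. */
    static SketchFileSystem fileSystemFor(URI path, SketchFileSystem defaultFs) {
        if (path.getScheme() == null) {
            return defaultFs;
        }
        String authority = path.getAuthority();
        String root = path.getScheme() + "://" + (authority == null ? "/" : authority);
        return new SketchFileSystem(URI.create(root));
    }

    public static void main(String[] args) {
        SketchFileSystem defaultFs =
            new SketchFileSystem(URI.create("hdfs://namenode.adsf.companyname.com"));
        URI s3Path = URI.create("s3n://xxx/yyy/zz/");

        // Pattern assumed to be behind the bug: always ask the default filesystem.
        try {
            defaultFs.checkPath(s3Path);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }

        // Alternative: resolve the filesystem from the path itself, then check.
        fileSystemFor(s3Path, defaultFs).checkPath(s3Path);
        System.out.println("resolved by path: ok");
    }
}
```

In real Hadoop code the analogous distinction is between `FileSystem.get(conf)`, which returns the default filesystem, and `Path.getFileSystem(conf)`, which resolves one from the path's URI. Whether that is the right fix for FileLocalizer is a guess, not something confirmed here.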

D

On Sat, Oct 13, 2012 at 2:53 AM, Martin Goodson
<[EMAIL PROTECTED]> wrote:
> Hi Dmitriy,
> here is the stack trace:
>
> java.lang.IllegalArgumentException: Wrong FS: s3n://xxx/yyy/zz/, expected: hdfs://namenode.adsf.companyname.com
>         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:433)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:129)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:523)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:820)
>         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:203)
>         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:131)
>         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:147)
>         at org.apache.pig.impl.io.FileLocalizer.fullPath(FileLocalizer.java:316)
>         at org.apache.pig.piggybank.storage.JsonMetadata.findMetaFile(JsonMetadata.java:94)
>         at org.apache.pig.piggybank.storage.JsonMetadata.getSchema(JsonMetadata.java:154)
>         at org.apache.pig.piggybank.storage.AllLoader.getSchema(AllLoader.java:400)
>         at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
>         at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
>         at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
>         at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
>         at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>         at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>         at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
>         at org.apache.pig.PigServer$Graph.compile(PigServer.java:1679)
>         at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
>         at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
>         at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>         at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>         at org.apache.pig.Main.run(Main.java:495)
>         at org.apache.pig.Main.main(Main.java:111)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:601)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>
>
> Thanks for taking a look. I will start looking into HCatalog too.
>
> Martin
>
>
> On 12 October 2012 18:56, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
>
>> Martin,
>> Do you have the complete stack trace?
>> Generally, for Hive interop I recommend HCatalog; AllLoader is neat
>> but it's a 3rd party contrib and we don't really know it too well. I
>> can check out the error dump and see if there's anything obvious
>> though.
>>
>> D
>>
>> On Fri, Oct 12, 2012 at 8:48 AM, Martin Goodson
>> <[EMAIL PROTECTED]> wrote:
>> > I am trying to load some text files in hive partitions on S3 using the
>> > AllLoader function with no success. I get an error which indicates that
>> > AllLoader is expecting the files to be on hdfs: