Pig >> mail # user >> How can I read Hive text files on S3 from Pig?


Martin Goodson 2012-10-12, 15:48
Dmitriy Ryaboy 2012-10-12, 17:56
Martin Goodson 2012-10-13, 09:53
Dmitriy Ryaboy 2012-10-18, 04:15
Martin Goodson 2012-10-18, 12:22

Re: How can I read Hive text files on S3 from Pig?
The same underlying class is used by PigStorage in 11, so we should
clean this up to make S3 users happy.

D

On Thu, Oct 18, 2012 at 5:22 AM, Martin Goodson
<[EMAIL PROTECTED]> wrote:
> Sure - thanks for having a look. By the way, I've moved to HCatalog and
> things look like they are working.
> Thanks again
> Martin
>
> On 18 October 2012 05:15, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
>
>> Yeah, that's a bug in FileLocalizer; apparently it assumes local or
>> HDFS only. Could you file a JIRA?
>>
>> D
>>
>> On Sat, Oct 13, 2012 at 2:53 AM, Martin Goodson
>> <[EMAIL PROTECTED]> wrote:
>> > Hi Dmitriy,
>> > here is the stack trace:
>> >
>> > java.lang.IllegalArgumentException: Wrong FS: s3n://xxx/yyy/zz/, expected: hdfs://namenode.adsf.companyname.com
>> >         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:433)
>> >         at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:129)
>> >         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:523)
>> >         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:820)
>> >         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:203)
>> >         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:131)
>> >         at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:147)
>> >         at org.apache.pig.impl.io.FileLocalizer.fullPath(FileLocalizer.java:316)
>> >         at org.apache.pig.piggybank.storage.JsonMetadata.findMetaFile(JsonMetadata.java:94)
>> >         at org.apache.pig.piggybank.storage.JsonMetadata.getSchema(JsonMetadata.java:154)
>> >         at org.apache.pig.piggybank.storage.AllLoader.getSchema(AllLoader.java:400)
>> >         at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
>> >         at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
>> >         at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
>> >         at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
>> >         at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>> >         at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>> >         at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
>> >         at org.apache.pig.PigServer$Graph.compile(PigServer.java:1679)
>> >         at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
>> >         at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
>> >         at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>> >         at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>> >         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>> >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>> >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>> >         at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>> >         at org.apache.pig.Main.run(Main.java:495)
>> >         at org.apache.pig.Main.main(Main.java:111)
>> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >         at java.lang.reflect.Method.invoke(Method.java:601)
>> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>> >
>> >
>> > Thanks for taking a look. I will start looking into HCatalog too.
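
For readers landing on this thread: the "Wrong FS" message in the trace comes from a file system object being asked about a path that belongs to a different scheme. The sketch below is a simplified stand-in built only on java.net.URI, not Hadoop's or Pig's actual code; the class WrongFsSketch and its methods are hypothetical names chosen for illustration. It mimics the scheme/authority comparison and shows why resolving the file system from the path itself (in Hadoop, Path.getFileSystem(Configuration) rather than FileSystem.get(Configuration)) would likely avoid the error.

```java
import java.net.URI;

// Hypothetical illustration: a simplified stand-in for the scheme/authority
// comparison behind "Wrong FS". This is NOT Hadoop's real implementation.
public class WrongFsSketch {

    // True when `path` can be served by the file system identified by `fsUri`.
    static boolean sameFileSystem(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return true; // unqualified path: resolved against fsUri
        }
        if (!scheme.equalsIgnoreCase(fsUri.getScheme())) {
            return false; // e.g. s3n path handed to an hdfs file system
        }
        String authority = path.getAuthority();
        return authority == null || authority.equalsIgnoreCase(fsUri.getAuthority());
    }

    // Throws in the same spirit as the exception quoted in the thread.
    static void checkPath(URI fsUri, URI path) {
        if (!sameFileSystem(fsUri, path)) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    // Picks the file system from the path itself and falls back to the
    // default only for unqualified paths.
    static URI fileSystemFor(URI defaultFs, URI path) {
        if (path.getScheme() == null) {
            return defaultFs;
        }
        String authority = path.getAuthority();
        return URI.create(path.getScheme()
            + (authority == null ? ":///" : "://" + authority));
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode.adsf.companyname.com");
        URI s3Path = URI.create("s3n://xxx/yyy/zz/");

        // Asking the default (hdfs) file system about an s3n path fails.
        try {
            checkPath(defaultFs, s3Path);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }

        // Resolving the file system from the path first passes the check.
        URI owner = fileSystemFor(defaultFs, s3Path);
        checkPath(owner, s3Path);
        System.out.println("Resolved file system: " + owner);
    }
}
```

Whether the eventual FileLocalizer fix took this form is not stated in the thread; the sketch only illustrates the general pattern of letting the path choose its file system.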