
Re: How can I read Hive text files on S3 from Pig?
Do you have the complete stack trace?
Generally, for Hive interop I recommend HCatalog; AllLoader is neat,
but it's a third-party contrib and we don't really know it too well. I
can check out the error dump and see if there's anything obvious.
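
For reference, a minimal sketch of what the HCatalog route might look like. It assumes the Hive table is already registered in the metastore with its partitions pointing at the S3 location, and that the HCatalog jars are on Pig's classpath. The table name `mydb.mytable` and the partition column `dt` are hypothetical, and the loader's package name depends on the HCatalog version (later releases moved it under `org.apache.hive.hcatalog.pig`).

```pig
-- Sketch only: 'mydb.mytable' and 'dt' are placeholder names, not from the original post.
-- HCatLoader takes a table name rather than a file path and gets the schema
-- and partition layout from the Hive metastore.
a = LOAD 'mydb.mytable' USING org.apache.hcatalog.pig.HCatLoader();

-- Partition columns show up as ordinary fields; filtering on one right after
-- the LOAD lets the loader restrict the read to matching partitions.
b = FILTER a BY dt == '2012-10-12';

DUMP b;
```

Because the loader works from table metadata instead of a path, the data no longer has to live on the default file system named in the cluster configuration.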


On Fri, Oct 12, 2012 at 8:48 AM, Martin Goodson wrote:
> I am trying to load some text files in Hive partitions on S3 using the
> AllLoader function, with no success. I get an error which indicates that
> AllLoader is expecting the files to be on HDFS:
> a = LOAD 's3n://xxxxx/yyyyy/zzz' using org.apache.pig.piggybank.storage.AllLoader();
> grunt> 2012-10-12 14:51:26,229 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Wrong FS: s3n://xxxxx/yyyyy/zzz, expected: hdfs://namenode.hadoop.companyname.com
> Reading the files with PigStorage works fine, but PigStorage is not aware
> of the Hive partition structure, so I cannot query the data using this
> method (I have to specify the file manually):
> a = LOAD 's3n://xxxxx/yyyyy/zzzZ' using PigStorage();
> Is there a way of reading Hive partitions from Pig over S3?
> hive-0.9.0
> pig-0.10.0
> hadoop-0.20
> Thank you
> Martin