Pig, mail # user - How can I read Hive text files on S3 from Pig?


How can I read Hive text files on S3 from Pig?
Martin Goodson 2012-10-12, 15:48
I am trying to load some text files stored in Hive partitions on S3 using the
AllLoader function, without success. The error I get suggests that
AllLoader expects the files to be on HDFS:

a = LOAD 's3n://xxxxx/yyyyy/zzz' USING org.apache.pig.piggybank.storage.AllLoader();
grunt> 2012-10-12 14:51:26,229 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Wrong FS: s3n://xxxxx/yyyyy/zzz, expected: hdfs://namenode.hadoop.companyname.com
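
For readers unfamiliar with this message: a "Wrong FS" error is what one would expect when a path is handed to a filesystem object whose scheme and host differ from the path's own. The following is only a toy model of that kind of check, written in Python for illustration. It is not Hadoop's or AllLoader's actual code, and the function name check_path is hypothetical.

```python
from urllib.parse import urlparse


def check_path(fs_uri, path):
    """Toy model: reject a path that belongs to a different filesystem.

    fs_uri is the URI of the filesystem doing the check (an assumption
    here: the cluster's default HDFS); path is the location being loaded.
    """
    fs = urlparse(fs_uri)
    p = urlparse(path)
    # A path without a scheme is taken as relative to this filesystem.
    if not p.scheme:
        return
    same_scheme = p.scheme.lower() == fs.scheme.lower()
    same_host = p.netloc.lower() == fs.netloc.lower()
    if not (same_scheme and same_host):
        raise ValueError("Wrong FS: %s, expected: %s" % (path, fs_uri))


try:
    check_path("hdfs://namenode.hadoop.companyname.com", "s3n://xxxxx/yyyyy/zzz")
except ValueError as err:
    print(err)
```

Under this model the s3n path is rejected because it is checked against the HDFS filesystem rather than against a filesystem resolved from the path itself.
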
Reading the files with PigStorage works fine, but PigStorage is not aware
of the Hive partition structure, so I cannot query the data using this
method (I have to specify the file manually):

a = LOAD 's3n://xxxxx/yyyyy/zzzZ' using PigStorage();
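
To make "Hive partition structure" concrete: Hive stores each partition in a directory named key=value, so the partition columns live in the path rather than in the file contents. The sketch below, in Python purely for illustration, shows how those values could be recovered from a file path. The function name partition_values and the example path (bucket, table, dt, country, file name) are hypothetical and do not come from the setup described here.

```python
def partition_values(path):
    """Return {key: value} for every key=value directory segment in path."""
    values = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        # Only segments of the form key=value describe a partition column.
        if sep and key:
            values[key] = value
    return values


# Hypothetical layout: two partition columns, dt and country.
example = "s3n://bucket/table/dt=2012-10-12/country=uk/000000_0"
print(partition_values(example))
```

A loader that is not partition-aware reads only the file contents, so columns such as dt and country in this hypothetical layout would be missing from the loaded relation.
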

Is there a way of reading Hive partitions from Pig over S3?

hive-0.9.0
pig-0.10.0
hadoop-0.20
Thank you
Martin