Pig >> mail # user >> How can I read Hive text files on S3 from Pig?

How can I read Hive text files on S3 from Pig?
I am trying to load some text files stored in Hive partitions on S3 using
the AllLoader function, without success. I get an error indicating that
AllLoader expects the files to be on HDFS:

a = LOAD 's3n://xxxxx/yyyyy/zzz' using
grunt> 2012-10-12 14:51:26,229 [main] ERROR
org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error.
Wrong FS: s3n://xxxxx/yyyyy/zzz, expected: hdfs://
Reading the files with PigStorage works fine, but PigStorage is not aware
of the Hive partition structure, so I cannot query the data using this
method (I have to specify the file manually):

a = LOAD 's3n://xxxxx/yyyyy/zzzZ' using PigStorage();
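
For context, by "Hive partition structure" I mean the key=value directory
names that Hive puts in the path of each partition. A minimal Python sketch
of how those values could be recovered from a path, assuming the standard
key=value layout (the bucket, table, column names and the
parse_hive_partitions helper below are all hypothetical, not my real data
and not part of Pig or Hive):

```python
def parse_hive_partitions(path):
    """Collect partition columns from key=value segments of a path."""
    partitions = {}
    for segment in path.split("/"):
        # partition() returns (key, "=", value) only when "=" is present
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


# Hypothetical path, for illustration only
example = "s3n://example-bucket/warehouse/events/dt=2012-10-12/country=uk/part-00000"
print(parse_hive_partitions(example))
```

A partition-aware loader would expose these values as extra columns; a
plain PigStorage load only sees the file contents.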

Is there a way of reading Hive partitions from Pig over S3?

Thank you
Dmitriy Ryaboy 2012-10-12, 17:56
Martin Goodson 2012-10-13, 09:53
Dmitriy Ryaboy 2012-10-18, 04:15
Martin Goodson 2012-10-18, 12:22
Dmitriy Ryaboy 2012-10-18, 20:59