Pig, mail # user - How can I read Hive text files on S3 from Pig?


Martin Goodson 2012-10-12, 15:48
Re: How can I read Hive text files on S3 from Pig?
Dmitriy Ryaboy 2012-10-12, 17:56
Martin,
Do you have the complete stack trace?
Generally, for Hive interop I recommend HCatalog; AllLoader is neat
but it's a 3rd party contrib and we don't really know it too well. I
can check out the error dump and see if there's anything obvious
though.

D
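
For reference, the HCatalog route suggested above might look roughly like the following in Pig. This is a minimal sketch, not something tested in this thread: the table name `weblogs` and the partition column `dt` are hypothetical, it assumes the Hive table and its partitions are already registered in the metastore, and the loader class shown is the one shipped with HCatalog releases of that period (later releases moved it to `org.apache.hive.hcatalog.pig.HCatLoader`). Whether it avoids the S3 error depends on the setup.

```pig
-- Hypothetical table name; HCatLoader reads the schema and
-- partition layout from the Hive metastore.
a = LOAD 'weblogs' USING org.apache.hcatalog.pig.HCatLoader();

-- Partition columns appear as ordinary fields. Filtering on one
-- right after the LOAD lets HCatalog read only matching partitions.
b = FILTER a BY dt == '2012-10-12';

DUMP b;
```

The HCatalog jars must be on Pig's classpath for the loader class to resolve.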

On Fri, Oct 12, 2012 at 8:48 AM, Martin Goodson
<[EMAIL PROTECTED]> wrote:
> I am trying to load some text files in hive partitions on S3 using the
> AllLoader function with no success. I get an error which indicates that
> AllLoader is expecting the files to be on hdfs:
>
> a = LOAD 's3n://xxxxx/yyyyy/zzz' using
> org.apache.pig.piggybank.storage.AllLoader();
> grunt> 2012-10-12 14:51:26,229 [main] ERROR
> org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error.
> Wrong FS: s3n://xxxxx/yyyyy/zzz, expected: hdfs://namenode.hadoop.companyname.com
>
>
> Reading the files with PigStorage works fine, but PigStorage is not aware
> of the Hive partition structure, so I cannot query the data using this
> method (I have to specify the file manually):
>
> a = LOAD 's3n://xxxxx/yyyyy/zzzZ' using PigStorage();
>
> Is there a way of reading Hive partitions from Pig over S3?
>
> hive-0.9.0
> pig-0.10.0
> hadoop-0.20
>
>
> Thank you
> Martin
Martin Goodson 2012-10-13, 09:53
Dmitriy Ryaboy 2012-10-18, 04:15
Martin Goodson 2012-10-18, 12:22
Dmitriy Ryaboy 2012-10-18, 20:59