Pig, mail # user - Pig 0.9.2 and avro on S3


meghana narasimhan 2012-11-30, 20:59
meghana narasimhan 2012-11-30, 21:05
Re: Pig 0.9.2 and avro on S3
William Oberman 2012-11-30, 21:09
A couple of weeks ago I spent a bunch of time trying to get EMR + S3 + Avro
working:
https://forums.aws.amazon.com/thread.jspa?messageID=398194

Short story: yes, I think PIG-2540 is the issue.  I'm currently trying to
get Pig 0.10 running in EMR with help from AWS support.  You have to add
this bootstrap action when starting the cluster:
--bootstrap-action s3://elasticmapreduce/bootstrap-actions/run-if --args
"instance.isMaster=true,s3://yourbucket/path/install_pig_0.10.0.sh"

install_pig_0.10.0.sh contents:
---------------------
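#!/usr/bin/env bash
# Download and unpack Pig 0.10.0 under /home/hadoop
cd /home/hadoop
wget http://apache.mirrors.hoobly.com/pig/pig-0.10.0/pig-0.10.0.tar.gz
tar zxf pig-0.10.0.tar.gz
mv pig-0.10.0 pig
# Point future shells at this Hadoop home and the new pig binary
echo "export HADOOP_HOME=/home/hadoop" >> ~/.bashrc
echo "export PATH=/home/hadoop/pig/bin/:\$PATH" >> ~/.bashrc
# Build Pig, then build piggybank
cd pig
ant
cd contrib/piggybank/java
ant
# Put piggybank.jar and the json-simple jar in /home/hadoop/lib
cp piggybank.jar /home/hadoop/lib/.
cd /home/hadoop/lib
wget "http://json-simple.googlecode.com/files/json_simple-1.1.jar"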
------------------
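
Since the script above is untested, a quick check after the bootstrap action runs may help. This is only a sketch: the function name verify_pig_install is made up for illustration, and the three paths are simply the files the script above should leave behind, assuming the build and downloads succeed.

```shell
#!/usr/bin/env bash
# Sketch: check that the files the install script should leave behind exist.
# verify_pig_install is a hypothetical helper, not part of Pig or EMR.

# Print each expected file that is missing under ROOT; return 1 if any is missing.
verify_pig_install() {
  local root="$1" missing=0 f
  for f in pig/bin/pig lib/piggybank.jar lib/json_simple-1.1.jar; do
    if [ ! -e "$root/$f" ]; then
      echo "missing: $root/$f"
      missing=1
    fi
  done
  return "$missing"
}

# ROOT defaults to /home/hadoop, the directory the install script works in.
verify_pig_install "${1:-/home/hadoop}" || echo "install looks incomplete"
```

Run it on the master node; no output means all three files are present.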

But note, I have NOT gotten around to testing this yet!  If you do, and it
works, let me know :-)

will

On Fri, Nov 30, 2012 at 4:05 PM, meghana narasimhan <
[EMAIL PROTECTED]> wrote:

> Oh I should also mention piggybank : 0.9.2-cdh4.0.1
>
>
> On Fri, Nov 30, 2012 at 12:59 PM, meghana narasimhan <
> [EMAIL PROTECTED]> wrote:
>
> > Hi all,
> >
> > Is this bug https://issues.apache.org/jira/browse/PIG-2540 applicable to
> > plain ec2 instances as well? I seem to have hit a snag with Apache Pig
> > version 0.9.2-cdh4.0.1 (rexported) and avro files on S3. My hadoop cluster
> > is made of Amazon ec2 instances.
> >
> > Here is my load statement :
> >
> > dimRad = LOAD 's3n://credentials@bucket/dimensions/2012/11/29/20121129-000159123456/dim'
> > USING
> >   AVRO_STORAGE AS
> >    (a:int
> >   , b:chararray
> >   );
> >
> > and it gives me a :
> >
> > 2012-11-30 20:42:44,205 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> > ERROR 1200: Wrong FS: s3n://credentials@bucket/dimensions/2012/11/29/20121129-000159123456/dim,
> > expected: hdfs://ec2-1xxxx.compute-1.amazonaws.com:8020
> >
> >
> > Thanks,
> > Meg
> >
>
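
The "Wrong FS" error quoted above says the s3n:// path was checked against the cluster's default hdfs:// filesystem. One possible stopgap, not something tested in this thread, is to copy the input onto HDFS with hadoop distcp and LOAD it from there. The sketch below only prints the command it would run. The helper names uri_scheme and stage_command, the bucket, and the namenode address are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: decide whether input needs staging on HDFS before a Pig LOAD.
# uri_scheme and stage_command are hypothetical helpers for illustration.

# Print the URI scheme of a path ("s3n" for s3n://bucket/key); empty if none.
uri_scheme() {
  case "$1" in
    *://*) printf '%s\n' "${1%%://*}" ;;
    *)     printf '\n' ;;
  esac
}

# Print the distcp command that would copy SRC to DEST when their schemes
# differ. Prints nothing when the schemes already match. Nothing is executed.
stage_command() {
  local src="$1" dest="$2"
  if [ "$(uri_scheme "$src")" != "$(uri_scheme "$dest")" ]; then
    printf 'hadoop distcp %s %s\n' "$src" "$dest"
  fi
}

# Hypothetical paths: substitute your own bucket and namenode.
stage_command 's3n://yourbucket/dimensions/dim' 'hdfs://namenode:8020/tmp/dim'
```

After the copy, the LOAD statement would point at the hdfs:// destination instead of the s3n:// source, at the cost of an extra copy step per input.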
William Oberman 2012-11-30, 21:11
meghana narasimhan 2012-11-30, 21:15