Pig >> mail # user >> Pig 0.9.2 and avro on S3


meghana narasimhan 2012-11-30, 20:59
meghana narasimhan 2012-11-30, 21:05
William Oberman 2012-11-30, 21:09
William Oberman 2012-11-30, 21:11
Re: Pig 0.9.2 and avro on S3
Ah thanks William. I am trying it with an upgraded piggybank
of 0.10.0-cdh4.1.1. It seems to be running, else pig 0.10 would be the way
to go.
On Fri, Nov 30, 2012 at 1:11 PM, William Oberman
<[EMAIL PROTECTED]> wrote:

> I should have read more closely, you're not using EMR.
>
> I'm guessing if you upgrade to pig 0.10 the issue will go away...
>
>
> On Fri, Nov 30, 2012 at 4:09 PM, William Oberman
> <[EMAIL PROTECTED]> wrote:
>
> > A couple of weeks ago I spent a bunch of time trying to get EMR + S3 +
> > Avro working:
> > https://forums.aws.amazon.com/thread.jspa?messageID=398194
> >
> > Short story, yes I think PIG-2540 is the issue.  I'm currently trying to
> > get pig 0.10 running in EMR with help from AWS support.   You have to do:
> > --bootstrap-action s3://elasticmapreduce/bootstrap-actions/run-if --args
> > "instance.isMaster=true,s3://yourbucket/path/install_pig_0.10.0.sh"
> >
> > install_pig_0.10.0.sh contents:
> > ---------------------
> > #!/usr/bin/env bash
> > cd /home/hadoop
> > wget http://apache.mirrors.hoobly.com/pig/pig-0.10.0/pig-0.10.0.tar.gz
> > tar zxf pig-0.10.0.tar.gz
> > mv pig-0.10.0 pig
> > echo "export HADOOP_HOME=/home/hadoop" >> ~/.bashrc
> > echo "export PATH=/home/hadoop/pig/bin/:\$PATH" >> ~/.bashrc
> > cd pig
> > ant
> > cd contrib/piggybank/java
> > ant
> > cp piggybank.jar /home/hadoop/lib/.
> > cd /home/hadoop/lib
> > wget "http://json-simple.googlecode.com/files/json_simple-1.1.jar"
> > ------------------
> >
> > But note, I have NOT got around to testing this yet!   If you do, and it
> > works let me know :-)
> >
> > will
> >
> > On Fri, Nov 30, 2012 at 4:05 PM, meghana narasimhan <
> > [EMAIL PROTECTED]> wrote:
> >
> >> Oh I should also mention piggybank : 0.9.2-cdh4.0.1
> >>
> >>
> >> On Fri, Nov 30, 2012 at 12:59 PM, meghana narasimhan <
> >> [EMAIL PROTECTED]> wrote:
> >>
> >> > Hi all,
> >> >
> >> > Is this bug https://issues.apache.org/jira/browse/PIG-2540 applicable to
> >> > plain ec2 instances as well? I seem to have hit a snag with Apache Pig
> >> > version 0.9.2-cdh4.0.1 (rexported) and avro files on S3. My hadoop cluster
> >> > is made of Amazon ec2 instances.
> >> >
> >> > Here is my load statement :
> >> >
> >> > dimRad = LOAD 's3n://credentials@bucket/dimensions/2012/11/29/20121129-000159123456/dim'
> >> >   USING AVRO_STORAGE AS
> >> >    (a:int
> >> >   , b:chararray
> >> >   );
> >> >
> >> > and it gives me a:
> >> >
> >> > 2012-11-30 20:42:44,205 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> >> > ERROR 1200: Wrong FS: s3n://credentials@bucket/dimensions/2012/11/29/20121129-000159123456/dim,
> >> > expected: hdfs://ec2-1xxxx.compute-1.amazonaws.com:8020
> >> >
> >> >
> >> > Thanks,
> >> > Meg
> >> >
> >>
> >
> >
> >
> >
>