Ruslan Al-Fakikh 2013-04-05, 01:39
Re: Pig JasonParser
We are running EB (Elephant Bird) in production with Pig 0.11 against CDH3.
Hadoop 2 is a different story: lots of things need to change for that to
work. Raghu has a branch that makes the EB changes:
https://github.com/rangadi/elephant-bird/tree/hadoop-2.0-support
On Thu, Apr 4, 2013 at 6:39 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:

> Hi guys,
>
> As for elephant-bird, it seems that it is not compatible with Pig 0.10
> (CDH4) :(
> I am using this configuration:
> pig -version
> Apache Pig version 0.10.0-cdh4.1.1 (rexported)
> hadoop version
> Hadoop 2.0.0-cdh4.1.1
> and getting just the same error as Tim explained:
> java.lang.IncompatibleClassChangeError: Found interface
> org.apache.hadoop.mapreduce.Counter, but class was expected
>
> I am running it with the following commands:
> REGISTER elephant-bird-pig-3.0.2.jar;
> inputData = LOAD 'sample_simple.json' USING
> com.twitter.elephantbird.pig.load.JsonLoader() as (json:map[]);
> DUMP inputData;
>
>
> On Thu, Sep 27, 2012 at 8:48 AM, Dmitriy Ryaboy <[EMAIL PROTECTED]>
> wrote:
>
> > Yep. It's just JsonLoader.
> > By default it works on top of whatever's returned by TextInputFormat, but
> > you can override that. As long as the input format returns a string that's
> > valid JSON, we are cool (so in theory you could write a
> > TwitterAPIInputFormat or something, and get the JSON in Pig, not that I
> > would recommend that).
> >
> > D
> >
> > On Wed, Sep 26, 2012 at 9:34 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> >
> > > Does that work without lzo?
> > >
> > > Russell Jurney http://datasyndrome.com
> > >
> > > On Sep 26, 2012, at 9:00 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> > >
> > > > Try asking Michael May on GitHub? This seems to be an issue with his
> > > > loader.
> > > >
> > > > The JsonLoader in ElephantBird should work in this case if you turn on
> > > > nested parsing (
> > > > https://github.com/kevinweil/elephant-bird/blob/master/pig/src/main/java/com/twitter/elephantbird/pig/load/JsonLoader.java
> > > > )
> > > >
> > > > D
> > > >
> > > > On Wed, Sep 26, 2012 at 2:31 PM, Deepak Tiwari <[EMAIL PROTECTED]> wrote:
> > > >
> > > > >> My bad. I think I compiled it from
> > > > >> https://github.com/mmay/PigJsonLoader/blob/master/JsonLoader.java
> > > > >> a long time back in my piggybank area. It indeed didn't come with the
> > > > >> original jar.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Deepak
> > > >>
> > > > >> On Tue, Sep 25, 2012 at 8:14 AM, Bill Graham <[EMAIL PROTECTED]> wrote:
> > > >>
> > > > >>> I missed the part about Piggybank, but I'm confused because I don't
> > > > >>> see that class in SVN:
> > > > >>>
> > > > >>> http://svn.apache.org/viewvc/pig/branches/branch-0.10/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/
> > > > >>>
> > > > >>> Either way, your error seems to be an issue with parsing the doubles.
> > > >>>
> > > >>>
> > > > >>> On Mon, Sep 24, 2012 at 2:24 PM, Vivek Shrivastava <[EMAIL PROTECTED]> wrote:
> > > >>>
> > > > >>>> Thanks for responding, Bill. However, I am using the JsonLoader that
> > > > >>>> is in the Piggybank with Pig 0.10.0.
> > > > >>>>
> > > > >>>> It doesn't need any schema and loads JSON data as a map (
> > > > >>>> org.apache.pig.piggybank.storage.JsonLoader() as (json:map[]) ), and
> > > > >>>> I extract data from there using keys. I have processed a huge amount
> > > > >>>> of data without any problem, and no schema was required.
> > > >>>>
> > > >>>> Regards,
> > > >>>>
> > > >>>> Vivek
> > > >>>>
> > > > >>>> On Mon, Sep 24, 2012 at 2:03 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> > > >>>>
> > > > >>>>> This loader only works for data stored using JsonStorage. From the
> > > > >>>>> javadocs:
> > > > >>>>>
> > > > >>>>> A loader for data stored using
> > > > >>>>> JsonStorage<
> > > > >>>>> http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/JsonStorage.html
Ruslan Al-Fakikh 2013-04-09, 10:53