Re: Pig JasonParser
We are running EB in production with Pig 0.11 against CDH3.
Hadoop 2 is a different story -- lots of things need to change to have that
work. Raghu has a branch that makes EB changes:
https://github.com/rangadi/elephant-bird/tree/hadoop-2.0-support
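
For readers following the JsonLoader discussion quoted below, here is a minimal Python sketch of the idea that each input record is a string holding one valid JSON object, which becomes a map. It is not Elephant Bird's implementation. The function name `load_json_lines`, the `nested` flag, the skipping of malformed lines, and the flat-mode handling of nested values are all assumptions made for illustration.

```python
import json


def load_json_lines(lines, nested=False):
    """Turn text lines into a list of dicts, one per valid JSON object.

    Hypothetical helper for illustration only; not the Elephant Bird API.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            obj = json.loads(line)
        except ValueError:
            continue  # assumption: skip malformed lines instead of failing
        if not isinstance(obj, dict):
            continue  # only a JSON object maps onto a Pig-style map
        if not nested:
            # Assumed flat mode: nested objects and arrays stay as JSON text.
            obj = {
                key: json.dumps(value) if isinstance(value, (dict, list)) else value
                for key, value in obj.items()
            }
        records.append(obj)
    return records


sample = [
    '{"id": 1, "user": {"name": "a"}}',
    'not json',
    '{"id": 2, "tags": ["x", "y"]}',
]
flat = load_json_lines(sample)               # nested values kept as strings
deep = load_json_lines(sample, nested=True)  # nested values kept as dicts/lists
```

With `nested=True` the inner `user` object can be navigated by key, which is roughly what "nested parsing" buys you; in the flat variant it would have to be parsed again downstream.
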
On Thu, Apr 4, 2013 at 6:39 PM, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:

> Hi guys,
>
> As for elephant-bird, it seems that it is not compatible with Pig 0.10
> (CDH4) :(
> I am using this configuration:
> pig -version
> Apache Pig version 0.10.0-cdh4.1.1 (rexported)
> hadoop version
> Hadoop 2.0.0-cdh4.1.1
> and getting just the same error as Tim explained:
> java.lang.IncompatibleClassChangeError: Found interface
> org.apache.hadoop.mapreduce.Counter, but class was expected
>
> I am running it with the following commands:
> REGISTER elephant-bird-pig-3.0.2.jar;
> inputData = LOAD 'sample_simple.json' USING
> com.twitter.elephantbird.pig.load.JsonLoader() as (json:map[]);
> DUMP inputData;
>
>
> On Thu, Sep 27, 2012 at 8:48 AM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
>
> > Yep. It's just JsonLoader.
> > By default it works on top of whatever's returned by TextInputFormat, but
> > you can override that. As long as the input format returns a string that's
> > valid JSON, we are cool (so in theory you could write a
> > TwitterAPIInputFormat or something, and get the JSON in Pig, not that I
> > would recommend that).
> >
> > D
> >
> > On Wed, Sep 26, 2012 at 9:34 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> >
> > > Does that work without lzo?
> > >
> > > Russell Jurney http://datasyndrome.com
> > >
> > > On Sep 26, 2012, at 9:00 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> > >
> > > > > Try asking Michael May on GitHub? This seems to be an issue with his
> > > > > loader.
> > > > >
> > > > > The JsonLoader in ElephantBird should work in this case if you turn on
> > > > > nested parsing (
> > > > > https://github.com/kevinweil/elephant-bird/blob/master/pig/src/main/java/com/twitter/elephantbird/pig/load/JsonLoader.java
> > > > > )
> > > >
> > > > D
> > > >
> > > > > On Wed, Sep 26, 2012 at 2:31 PM, Deepak Tiwari <[EMAIL PROTECTED]> wrote:
> > > >
> > > > >> My bad. I think I compiled it from
> > > > >> https://github.com/mmay/PigJsonLoader/blob/master/JsonLoader.java a long
> > > > >> time back in my piggybank area. It indeed didn't come with the original
> > > > >> jar.
> > > >>
> > > >> Regards,
> > > >>
> > > >> Deepak
> > > >>
> > > > >> On Tue, Sep 25, 2012 at 8:14 AM, Bill Graham <[EMAIL PROTECTED]> wrote:
> > > >>
> > > > >>> I missed the part about Piggybank, but I'm confused because I don't see
> > > > >>> that class in SVN:
> > > > >>>
> > > > >>> http://svn.apache.org/viewvc/pig/branches/branch-0.10/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/
> > > > >>>
> > > > >>> Either way, your error seems to be an issue with parsing the doubles.
> > > >>>
> > > >>>
> > > > >>> On Mon, Sep 24, 2012 at 2:24 PM, Vivek Shrivastava <[EMAIL PROTECTED]> wrote:
> > > >>>
> > > > >>>> Thanks for responding, Bill. However, I am using the JsonLoader that is
> > > > >>>> in the Piggybank with Pig-0.10.0.
> > > > >>>>
> > > > >>>> It doesn't need any schema and converts JSON data to a map (
> > > > >>>> org.apache.pig.piggybank.storage.JsonLoader() as (json:map[]) ), and I
> > > > >>>> extract data from there using keys. I have processed a huge amount of
> > > > >>>> data without any problem, and no schema was required.
> > > >>>>
> > > >>>> Regards,
> > > >>>>
> > > >>>> Vivek
> > > >>>>
> > > > >>>> On Mon, Sep 24, 2012 at 2:03 PM, Bill Graham <[EMAIL PROTECTED]> wrote:
> > > >>>>
> > > > >>>>> This loader only works for data stored using JsonStorage. From the
> > > > >>>>> javadocs:
> > > > >>>>>
> > > > >>>>> A loader for data stored using
> > > > >>>>> JsonStorage<
> > > > >>>>> http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/JsonStorage.html