Russell Jurney 2012-11-17, 22:09
Dan Young 2012-11-18, 01:23
Arian Pasquali 2012-11-18, 02:30
Russell Jurney 2012-11-18, 04:32
Russell Jurney 2012-11-18, 17:19
Arian Pasquali 2012-11-18, 22:46
Arian Pasquali 2012-11-19, 00:31
Re: How do I load JSON in Pig?
It seems that everyone can build elephant-bird but me:
https://github.com/kevinweil/elephant-bird/issues/272
On Sun, Nov 18, 2012 at 7:31 PM, Arian Pasquali <[EMAIL PROTECTED]> wrote:

> I don't think you really need to build it.
> You can find it in any Maven repository.
>
> Arian Rodrigo Pasquali
> FEUP, SAPO Labs
> http://www.arianpasquali.com
> twitter @arianpasquali
>
>
>
> 2012/11/18 Arian Pasquali <[EMAIL PROTECTED]>
>
> > You don't need to build it either.
> > Just download the jars I used in my example.
> >
> > Arian
> >
> > On Sunday, November 18, 2012, Russell Jurney wrote:
> >
> >> Thanks - looks like I don't have to specify the schema, which is good.
> >>
> >> I'll try and build elephant-bird.
> >>
> >> Russell Jurney http://datasyndrome.com
> >>
> >> On Nov 17, 2012, at 9:30 PM, Arian Pasquali <[EMAIL PROTECTED]>
> >> wrote:
> >>
> >> > keep calm
> >> > and use elephant-bird
> >> > https://github.com/kevinweil/elephant-bird
> >> > https://github.com/kevinweil/elephant-bird/blob/master/pig/src/main/java/com/twitter/elephantbird/pig/load/JsonLoader.java
> >> >
> >> >
> >> > I posted an example here yesterday of how to load tweets in JSON.
> >> > Here it is again. I hope it helps.
> >> >
> >> >    -- Jars used by the elephant-bird JSON loader
> >> >    register 'elephant-bird-core-3.0.0.jar';
> >> >    register 'elephant-bird-pig-3.0.0.jar';
> >> >    register 'google-collections-1.0.jar';
> >> >    register 'json-simple-1.1.jar';
> >> >
> >> >    -- Each record is loaded as a single map, so no schema is declared
> >> >    json_lines = LOAD '/twitter_data/tweets/stream/v1/json/2012_10_10/08'
> >> >        USING com.twitter.elephantbird.pig.load.JsonLoader();
> >> >
> >> >    -- Look up two keys in the map and cast them to chararray
> >> >    geo_tweets = FOREACH json_lines GENERATE
> >> >        (CHARARRAY) $0#'id' AS id,
> >> >        (CHARARRAY) $0#'geoLocation' AS geoLocation;
> >> >
> >> >    -- Keep only tweets that carry a location
> >> >    only_not_nulls = FILTER geo_tweets BY geoLocation is not null;
> >> >    store only_not_nulls into '/twitter_data/results/geo_tweets';
> >> >
> >> >
> >> >
> >> > Arian Rodrigo Pasquali
> >> > FEUP, SAPO Labs
> >> > http://www.arianpasquali.com
> >> > twitter @arianpasquali
> >> >
> >> >
> >> >
> >> > 2012/11/18 Dan Young <[EMAIL PROTECTED]>
> >> >
> >> >> Not sure if this helps, but in 0.11 I've been using this on EMR for some of
> >> >> our JSON data....
> >> >>
> >> >> raw = load 'hdfs:///cleaned_logs/clicks2/$year_id/$month_id/part-*' USING
> >> >> JsonLoader('a:chararray,at:chararray,c1:(url:chararray,useragent:chararray,referrer:chararray,window:(innerheight:chararray,innerwidth:chararray,outerheight:chararray,outerwidth:chararray),resolution:(height:chararray,width:chararray)),cst:chararray,d:(a:chararray,b:chararray),i:chararray,id:chararray,ip:chararray,k:chararray,l:(lat:chararray,lng:chararray),p:chararray,pv:chararray,sa:chararray,sid:chararray,sst:chararray,t:chararray,uuid:chararray,v:chararray');
> >> >>
> >> >>
> >> >> Regards,
> >> >>
> >> >> Dano
> >> >>
> >> >>
> >> >>
> >> >> On Sat, Nov 17, 2012 at 3:09 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> >> >>
> >> >>> I have some JSON data with a uniform schema. I want to load it in Pig.
> >> >>> JsonStorage doesn't work, because the data has no schema.
> >> >>>
> >> >>> How can I load JSON data in Pig?
> >> >>>
> >> >>> --
> >> >>> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED]
> >> >>> datasyndrome.com
> >> >>>
> >> >>
> >>
> >
> >
> > --
> > Sent from Gmail Mobile
> >
>

--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
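
For anyone following the quoted elephant-bird script without a Pig install at hand, its per-record logic can be approximated in plain Python. This is only a rough sketch of what the script appears to do: parse each line as a JSON object, pull out `id` and `geoLocation` as text, and drop rows where `geoLocation` is null. The names `as_text` and `load_geo_tweets` and the sample records are made up for illustration, silently skipping malformed lines is an assumption of this sketch, and none of it reflects how elephant-bird is actually implemented.

```python
import json


def as_text(value):
    """Rough stand-in for the (CHARARRAY) cast: None stays None, the rest becomes text."""
    if value is None:
        return None
    if isinstance(value, str):
        return value
    return json.dumps(value)


def load_geo_tweets(lines):
    """Approximate the quoted Pig script on an iterable of JSON lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)  # one JSON object per line, loaded as a map
        except ValueError:
            continue  # assumption of this sketch: malformed lines are skipped
        if not isinstance(record, dict):
            continue
        geo = as_text(record.get("geoLocation"))  # like $0#'geoLocation'
        if geo is None:
            continue  # like FILTER ... BY geoLocation is not null
        rows.append((as_text(record.get("id")), geo))
    return rows


if __name__ == "__main__":
    # Hypothetical sample records, not real tweet data.
    sample = [
        '{"id": 1, "geoLocation": {"latitude": 1.5, "longitude": 2.5}}',
        '{"id": 2, "geoLocation": null}',
        '{"id": 3}',
    ]
    for row in load_geo_tweets(sample):
        print(row)
```

As in the Pig version, no schema is declared up front: fields are looked up by key on each record, and a missing key simply yields null.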
Russell Jurney 2012-11-19, 19:27
Russell Jurney 2012-11-19, 19:30
Russell Jurney 2012-11-19, 19:33
Russell Jurney 2012-11-19, 19:35
Deepak Tiwari 2012-11-19, 20:22
Saxifrage Cucvara 2012-11-21, 05:56
David LaBarbera 2012-11-21, 14:25
Saxifrage Cucvara 2012-11-21, 22:36
Adam Kawa 2012-11-17, 23:40
Russell Jurney 2012-11-18, 22:46