Pig >> mail # user >> Reading json file.


Re: Reading json file.
Hi Gerrit. Can you please help me with an example?

On Fri, Aug 30, 2013 at 8:48 AM, Gerrit Jansen van Vuuren <
[EMAIL PROTECTED]> wrote:

> These are converted to tuples. I could've made them bags, but thought tuples
> represent arrays more generically.
>  On 30 Aug 2013 17:36, "Zhu Wayne" <[EMAIL PROTECTED]> wrote:
>
> > How do you deal with a JSON array or list of elements?
> >
> >
> > On Fri, Aug 30, 2013 at 9:45 AM, Gerrit Jansen van Vuuren <
> > [EMAIL PROTECTED]> wrote:
> >
> > > try using my UDF:
> > > https://github.com/gerritjvv/pigutils/tree/master/pigudfs/udfs
> > >
> > > it loads json as a map, and all nested objects as maps. I've not released
> > > the jar yet so you'll need to compile a jar from source.
> > >
> > > here's an example of how to use it:
> > >
> > > l = load 'myfile.json.gz' using org.nts.pigutils.udfs.JSONLoader();
> > > r = foreach l generate m#'user'#'age', m#'name';
> > >
> > >
> > >
> > > On Thu, Aug 29, 2013 at 3:19 PM, jamal sasha <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I have a JSON file in the following format:
> > > > { "_id" : "foo.com", "categories" : [], "h1" : { "bar==" : { "first" :
> > > > 1281916800, "last" : 1316995200 }, "foo==" : { "first" : 1281916800, "last"
> > > > : 1316995200 } }, "name2" : [ "foobarl.com", "foobar2.com" ], "rep" :
> > > > null }
> > > > So, how do I parse this JSON in Pig?
> > > >
> > > > Also, categories and rep can contain characters and might not always be
> > > > empty.
> > > >
> > > > Thanks
> > > >
> > > >
> > > >
> > > > This message may contain confidential and/or privileged information. If it
> > > > has been sent to you in error, please reply to advise the sender of the
> > > > error and then immediately delete this message.
> > >
> >
> >
> >
> > --
> > Wayne Zhu
> > 847-282-0596 (Google Voice)
> >
> >
> >
>
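
For reference, here is a rough sketch of how the loader example quoted above might be applied to the sample record from the original question. It is untested and rests on several assumptions: that the loader exposes each record as a single map field named m (inferred from the quoted example), that nested objects can be reached by chaining # lookups, and that JSON arrays such as name2 and categories arrive as tuples, as described in the thread. The jar name is hypothetical, since no jar had been released at the time.

```pig
-- Hypothetical jar name; compile the jar from the pigutils source first.
register 'pigutils-udfs.jar';

-- Assumption: each JSON record is loaded into one map field named m,
-- as in the example quoted earlier in the thread.
l = load 'myfile.json.gz' using org.nts.pigutils.udfs.JSONLoader();

r = foreach l generate
      m#'_id'                 as id,          -- top-level string
      m#'h1'#'bar=='#'first'  as bar_first,   -- nested object, chained map lookups
      m#'h1'#'foo=='#'last'   as foo_last,
      m#'name2'               as name2,       -- JSON array, a tuple per the thread
      m#'categories'          as categories,  -- may be an empty array
      m#'rep'                 as rep;         -- null in the sample record

dump r;
```

Map values in Pig are untyped by default, so a cast (for example to long for the first/last timestamps) may be needed before comparing or doing arithmetic on them.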