Pig >> mail # user >> Mapping nested json objects to map data type


Re: Mapping nested json objects to map data type
Hi Kiran,
      Can you take a look at the Pig scripts here:

https://github.com/mozilla-metrics/telemetry-toolbox/tree/master/src/main/pig
All of them use those JSON UDFs for parsing.
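
For context, the usual wiring looks something like the sketch below (the REGISTER path is illustrative and assumes the akela jar has been built locally):

```pig
-- Minimal sketch: register the akela jar (path is illustrative) and
-- invoke one of its JSON UDFs on data loaded from HBase.
REGISTER /path/to/akela.jar;

fields = LOAD 'hbase://documents'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:*', '-loadKey true -limit 5')
    AS (rowkey, metadata:map[]);

-- MapToJson, as its name suggests, appears to go in the map -> JSON
-- direction, serializing a Pig map into a JSON chararray.
json_out = FOREACH fields GENERATE com.mozilla.pig.eval.json.MapToJson(metadata);
DUMP json_out;
```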
--
Harsha
On Wednesday, March 13, 2013 at 11:25 PM, kiran chitturi wrote:

> Hi Harsha,
>
> I am using the UDF from this link:
> https://github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/pig/eval/json/MapToJson.java
>
> I was able to run it successfully, but I am having an issue: the output is
> null.
>
> Please find my commands below
>
> ----------
> fields = load 'hbase://documents' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:*','-loadKey true
> -limit 5') as (rowkey, metadata:map[]);
> fields_split = foreach fields generate
> com.mozilla.pig.eval.json.MapToJson(metadata);
> dump fields_split;
> -----------
>
> The output is 51 empty records. When I used the command 'illustrate
> fields_split', it gave me the output below.
>
> -------------------------------------------------------------------------
> | fields | rowkey:bytearray | metadata:map |
> -------------------------------------------------------------------------
> | | collection100hdfs://LucidN1:50001/input/reuters/reut2-021.sgm-166.txt | {fields_j={"tika.Content-Encoding":"ISO-8859-1","distanceToCentroid":0.5685425678289969,"tika.Content-Type":"text/plain; charset=ISO-8859-1","clusterId":118,"tika.parsing":"ok"}} |
> -------------------------------------------------------------------------
> ------------------------------------
> | fields_split | :chararray |
> ------------------------------------
> | | |
> ------------------------------------
>
> Am I missing something here? Could you share a simple working use case of
> yours, if you don't mind? All of my records have something in the 'fields'
> family, so it is quite strange to see empty results.
>
> Please let me know your suggestions.
>
> Thank you,
>
>
> On Wed, Mar 13, 2013 at 11:54 PM, Harsha <[EMAIL PROTECTED]> wrote:
>
> > Hi Kiran,
> > If you are ok with using java for udfs take a look at this
> >
> > https://github.com/mozilla-metrics/akela/tree/master/src/main/java/com/mozilla/pig/eval/json
> > We use MapToJson to parse complex JSON objects from HBase.
> > -Harsha
> >
> >
> > --
> > Harsha
> >
> >
> > On Wednesday, March 13, 2013 at 8:37 PM, kiran chitturi wrote:
> >
> > > Hi!
> > >
> > > I am using Pig 0.10 and I have a question about mapping nested JSON
> > > objects from HBase.
> > >
> > > *For example: *
> > >
> > > The below commands loads the field family from Hbase.
> > >
> > > fields = load 'hbase://documents' using
> > > org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:*','-loadKey true
> > > -limit 5') as (rowkey, metadata:map[]);
> > >
> > > The metadata field looks like the following after the above command.
> > > (I used 'illustrate fields' to get this.)
> > >
> > > {fields_j={"tika.Content-Encoding":"ISO-8859-1","distanceToCentroid":0.5761632290266712,"tika.Content-Type":"text/plain; charset=ISO-8859-1","clusterId":118,"tika.parsing":"ok"}}
> > >
> > > The map data type has worked as I wanted so far. Now, I would like the
> > > value for the 'fields_j' key to also be a map data type. I think it is
> > > being assigned as 'bytearray' by default.
> > >
> > > Is there any way I can convert this into a map data type? That would
> > > make it easier for me to process further.
> > >
> > > I tried to write a Python UDF, but Jython only supports Python 2.5, and
> > > I am not sure how to convert this string into a dictionary in Python.
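
For reference, on an interpreter with the standard-library `json` module (Python 2.6+; Jython 2.5 would need a bundled external library such as simplejson), the string-to-dictionary conversion described above is a one-liner. A minimal sketch, using a sample value modeled on the thread's metadata:

```python
import json

# Sample JSON string modeled on the 'fields_j' value shown in this thread
raw = ('{"tika.Content-Encoding":"ISO-8859-1",'
       '"distanceToCentroid":0.5685425678289969,'
       '"clusterId":118,"tika.parsing":"ok"}')

fields = json.loads(raw)  # parses the JSON object into a Python dict
print(fields["clusterId"])     # -> 118
print(fields["tika.parsing"])  # -> ok
```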
> > >
> > > Has anyone encountered this type of issue before?
> > >
> > > Sorry for the long question; I wanted to explain my problem clearly.
> > >
> > > Please let me know your suggestions.
> > >
> > > Regards,
> > >
> > > --
> > > Kiran Chitturi
> > >
> > > <http://www.linkedin.com/in/kiranchitturi>
>
>
> --
> Kiran Chitturi
>
> <http://www.linkedin.com/in/kiranchitturi>