Pig >> mail # user >> Mapping nested json objects to map data type

Re: Mapping nested json objects to map data type
Hi Kiran,
       If you are OK with using Java for UDFs, take a look at MapToJson; we use it to parse complex JSON objects from HBase.
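For reference, here is a minimal sketch of the parsing step such a UDF performs; the class and method names are made up for illustration, and a real UDF would extend org.apache.pig.EvalFunc and typically use a JSON library such as Jackson rather than a regex:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of turning a flat JSON object (as stored in the HBase cell) into a
// java.util.Map, which Pig exposes as its map type. Inside a real UDF you
// would call something like parseFlatJson(...) from exec(Tuple), after
// converting the bytearray input to a String.
public class JsonToMapSketch {

    // One "key": value pair; the value is either a quoted string or a bare
    // token (number, boolean) running up to the next ',' or '}'.
    private static final Pattern PAIR = Pattern.compile(
            "\"([^\"]+)\"\\s*:\\s*(?:\"((?:[^\"\\\\]|\\\\.)*)\"|([^,}]+))");

    public static Map<String, String> parseFlatJson(String json) {
        Map<String, String> out = new LinkedHashMap<String, String>();
        Matcher m = PAIR.matcher(json);
        while (m.find()) {
            // group(2): quoted string value; group(3): bare value.
            String value = m.group(2) != null ? m.group(2) : m.group(3).trim();
            out.put(m.group(1), value);
        }
        return out;
    }

    public static void main(String[] args) {
        String cell = "{\"tika.Content-Encoding\":\"ISO-8859-1\","
                + "\"clusterId\":118,\"tika.parsing\":\"ok\"}";
        System.out.println(parseFlatJson(cell));
    }
}
```

Note the regex only handles one level of nesting, which matches the 'fields_j' value in the question; nested objects or escaped quotes in keys would need a real JSON parser.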
On Wednesday, March 13, 2013 at 8:37 PM, kiran chitturi wrote:

> Hi!
> I am using Pig 0.10 and I have a question about mapping nested JSON
> objects from HBase.
> *For example:*
> The command below loads the 'field' column family from HBase.
> fields = load 'hbase://documents' using
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:*','-loadKey true
> -limit 5') as (rowkey, metadata:map[]);
> The metadata field looks like this after the above command (I used
> 'illustrate fields' to get it):
> {fields_j={"tika.Content-Encoding":"ISO-8859-1","distanceToCentroid":0.5761632290266712,"tika.Content-Type":"text/plain;
> charset=ISO-8859-1","clusterId":118,"tika.parsing":"ok"}}
> The map data type has worked as I wanted so far. Now I would like the value
> for the 'fields_j' key to also be a map data type; I think it is being
> assigned 'bytearray' by default.
> Is there any way I can convert this into a map data type? That would help
> me process it further.
> I tried to write a Python UDF, but Jython only supports Python 2.5 and I am
> not sure how to convert this string into a dictionary in Python.
> Has anyone encountered this type of issue before?
> Sorry for the long question; I wanted to explain my problem clearly.
> Please let me know your suggestions.
> Regards,
> --
> Kiran Chitturi
> <http://www.linkedin.com/in/kiranchitturi>