Mapping nested json objects to map data type
Hi!

I am using Pig 0.10 and I have a question about mapping nested JSON
objects loaded from HBase.

For example:

The command below loads the 'field' column family from HBase.

fields = load 'hbase://documents' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:*', '-loadKey true -limit 5')
    as (rowkey, metadata:map[]);

After the command above, the metadata field looks like this (I used
'illustrate fields' to get it):

{fields_j={"tika.Content-Encoding":"ISO-8859-1","distanceToCentroid":0.5761632290266712,"tika.Content-Type":"text/plain; charset=ISO-8859-1","clusterId":118,"tika.parsing":"ok"}}

The map data type has worked as I wanted so far. Now I would like the value
for the 'fields_j' key to be a map as well. I think it is being assigned the
'bytearray' type by default.

Is there any way to convert this value into a map data type? That would
help me with further processing.

I tried to write a Python UDF, but Jython only supports Python 2.5, and I am
not sure how to convert this string into a dictionary in Python.
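
For illustration, here is a minimal sketch of the kind of UDF I have in mind.
It is not tested against Pig 0.10. The function name json_to_map is
hypothetical. The sketch assumes that the value reaches the UDF as a string,
that a simplejson package is available to Jython 2.5 (the standard json
module only arrived in Python 2.6), and that Pig provides the outputSchema
decorator when it registers the script.

```python
try:
    import json  # in the standard library from Python 2.6 onward
except ImportError:
    import simplejson as json  # assumption: simplejson is on Jython 2.5's path

try:
    outputSchema  # assumption: Pig supplies this decorator at registration
except NameError:
    def outputSchema(schema):
        # No-op stand-in so the script also runs outside Pig.
        def wrap(func):
            return func
        return wrap


@outputSchema("fields:map[]")
def json_to_map(raw):
    """Parse a JSON object string into a dict; return None for anything else."""
    if raw is None:
        return None
    try:
        parsed = json.loads(raw)
    except (ValueError, TypeError):
        # Malformed JSON or an unexpected input type.
        return None
    if not isinstance(parsed, dict):
        # Only a top-level JSON object can become a map.
        return None
    return parsed
```

In a Pig script, I would expect to register this with something like
register 'json_udf.py' using jython as json_udf; and call it on
(chararray) metadata#'fields_j'. The file name and alias are made up, and I
have not confirmed that the cast behaves this way with HBaseStorage.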

Has anyone encountered this kind of issue before?

Sorry for the long question; I wanted to explain my problem clearly.

Please let me know your suggestions.

Regards,

--
Kiran Chitturi

<http://www.linkedin.com/in/kiranchitturi>
Harsha 2013-03-14, 03:54
kiran chitturi 2013-03-14, 06:25
Harsha 2013-03-14, 06:51
kiran chitturi 2013-03-14, 15:08
kiran chitturi 2013-03-14, 04:40