Pig >> mail # user >> Mapping nested json objects to map data type


Mapping nested json objects to map data type
Hi!

I am using Pig 0.10 and have a question about mapping nested JSON
objects from HBase.

*For example:*

The command below loads the 'field' column family from HBase.

fields = load 'hbase://documents' using
    org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:*', '-loadKey true -limit 5')
    as (rowkey, metadata:map[]);

After running this command, the metadata field looks like the following
(I used 'illustrate fields' to get this):

{fields_j={"tika.Content-Encoding":"ISO-8859-1","distanceToCentroid":0.5761632290266712,"tika.Content-Type":"text/plain; charset=ISO-8859-1","clusterId":118,"tika.parsing":"ok"}}

The map data type has worked as I wanted so far. Now I would like the value
of the 'fields_j' key to be a map as well. I think it is being assigned the
'bytearray' type by default.

Is there any way to convert this value into a map? That would let me
process it further.

I tried to write a Python UDF, but Jython only supports Python 2.5 (which
has no standard json module), so I am not sure how to convert this string
into a dictionary in Python.
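
One possible shape for such a UDF is sketched below. It is only a sketch under stated assumptions, not a confirmed solution from this thread. The module name json_udf.py and the function name json_to_map are hypothetical. It assumes that the simplejson package has been made importable on Jython 2.5 (the standard json module only exists from Python 2.6 onward), that Pig supplies the outputSchema decorator when the script is registered with Jython, and that the 'fields_j' value is cast to chararray before it reaches the UDF.

```python
# Hypothetical module json_udf.py: a sketch, not a tested Pig 0.10 solution.
# The syntax is kept compatible with Python 2.5 so that Jython can load it.

try:
    import json  # standard library from Python 2.6 onward
except ImportError:
    # Assumption: simplejson has been made importable for Jython 2.5.
    import simplejson as json

try:
    outputSchema  # assumed to be supplied by Pig for Jython UDF scripts
except NameError:
    # Stand-in so the module can also be imported and tested outside Pig.
    def outputSchema(schema):
        def decorate(func):
            return func
        return decorate


@outputSchema("parsed:map[]")
def json_to_map(raw):
    """Parse a JSON object string into a dict; return None if that fails."""
    if raw is None:
        return None
    # Assumption: any non-text input holds UTF-8 encoded bytes.
    if not isinstance(raw, type(u"")) and hasattr(raw, "decode"):
        raw = raw.decode("utf-8")
    try:
        value = json.loads(raw)
    except ValueError:
        return None  # malformed JSON becomes a null in Pig
    if not isinstance(value, dict):
        return None  # only JSON objects can be returned as a map
    return value


# Possible use from Pig (untested sketch; the alias names are hypothetical):
#   register 'json_udf.py' using jython as jsonudf;
#   parsed = foreach fields generate rowkey,
#       jsonudf.json_to_map((chararray) metadata#'fields_j') as fields_map;
```

Returning None for malformed or non-object input is a design choice in this sketch: a bad row then yields a null rather than failing the whole job.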

Has anyone encountered this kind of issue before?

Sorry for the long question; I wanted to explain my problem clearly.

Please let me know your suggestions.

Regards,

--
Kiran Chitturi

<http://www.linkedin.com/in/kiranchitturi>
Harsha 2013-03-14, 03:54
kiran chitturi 2013-03-14, 06:25
Harsha 2013-03-14, 06:51
kiran chitturi 2013-03-14, 15:08
kiran chitturi 2013-03-14, 04:40