Pig, mail # user - Parsing flexing json in pig
jamal sasha 2013-10-07, 17:59
Hi,
  I have a file of semi-structured JSON records, one per line.
For example:
{"id":1,"name":"foo"}
{"id":1,"name":"foo","address":"foobar"}
{"id":1,"name":"foo","address":"foobar","phone":[123,133]}
{"id":2,"name":"foobar","address":"foobar"}

And so on.
What I want to do is read this file, select "id", and for each id count
the records that have an "address" field.
If the "address" field is missing from a record, that record counts as 0.
So the output for the sample above is:
id, count_address
1,2
2,1

Also, I would like to use a Python UDF to parse this JSON. Is that possible?
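
For reference, the per-record logic described above can be sketched in plain Python. This is only an illustration of the intended result, not a working Pig UDF: the function names parse_record and count_addresses are hypothetical, the sample uses the third record with its closing bracket in place, and registering the function with Pig (and whether the json module is available in the Python runtime Pig uses) is left out and would depend on the Pig version.

```python
import json
from collections import defaultdict


def parse_record(line):
    """Return (id, has_address) for one JSON line; has_address is 1 or 0.

    Returns None when the line is not a JSON object with an "id" field.
    (Hypothetical helper, written for illustration only.)
    """
    try:
        record = json.loads(line)
    except ValueError:
        # Malformed JSON: skip the line rather than fail the whole job.
        return None
    if not isinstance(record, dict) or "id" not in record:
        return None
    return record["id"], 1 if "address" in record else 0


def count_addresses(lines):
    """Sum the has_address flags per id; ids with no address get 0."""
    counts = defaultdict(int)
    for line in lines:
        parsed = parse_record(line)
        if parsed is None:
            continue
        record_id, has_address = parsed
        counts[record_id] += has_address
    return dict(counts)


sample = [
    '{"id":1,"name":"foo"}',
    '{"id":1,"name":"foo","address":"foobar"}',
    '{"id":1,"name":"foo","address":"foobar","phone":[123,133]}',
    '{"id":2,"name":"foobar","address":"foobar"}',
]

print("id, count_address")
for record_id, n in sorted(count_addresses(sample).items()):
    print("%s,%d" % (record_id, n))
```

In Pig terms, parse_record corresponds to the part a UDF would handle for each input line, and count_addresses corresponds to a GROUP BY id followed by a SUM over the flag.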