Pig, mail # user - parsing data


parsing data
Keren Ouaknine 2014-01-16, 04:01
Hi,

I would like to parse data from the following format:
keren^A1^A2^Aqt^A3^A4^A5.0
to the following format:

{"user": "keren", "action": 1, "timespent": 2, "query_term": "qt", "ip_addr": 3, "timestamp": 4, "estimated_revenue": 5.0}

[I also happen to have a map and a bag of maps but for the sake of
simplicity I didn't add them to the example]
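
To make the intended mapping concrete, here is a minimal sketch in plain
Python (not Pig, and not PigPerformanceLoader). It assumes that ^A in the
example stands for the Ctrl-A byte (\x01) and that the field names and types
are the ones in the JSON above; FIELDS and parse_line are hypothetical names
used only for illustration.

```python
import json

# Field names and types are read off the example record in this message.
# FIELDS and parse_line are illustrative names, not part of Pig.
FIELDS = [
    ("user", str),
    ("action", int),
    ("timespent", int),
    ("query_term", str),
    ("ip_addr", int),
    ("timestamp", int),
    ("estimated_revenue", float),
]


def parse_line(line, delimiter="\x01"):
    """Split one delimited line and cast each value to its field type.

    Assumes the delimiter written as ^A is the Ctrl-A byte (\\x01).
    """
    values = line.rstrip("\n").split(delimiter)
    if len(values) != len(FIELDS):
        raise ValueError(
            "expected %d fields, got %d" % (len(FIELDS), len(values))
        )
    return {name: cast(value) for (name, cast), value in zip(FIELDS, values)}


if __name__ == "__main__":
    record = parse_line("keren\x011\x012\x01qt\x013\x014\x015.0")
    print(json.dumps(record))
```

The map and bag-of-maps fields mentioned below are left out here as well;
they would need their own parsing rules.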

Should I use PigPerformanceLoader or a Pig script for the above parsing?
Modifying PigPerformanceLoader seems like low-hanging fruit, though I
might have to make several changes to it. Modifying a Pig script seems
like a more elegant solution (I'm just not sure how to do it).

Thanks,
Keren

--
Keren Ouaknine
www.kereno.com