Pig >> mail # user >> parsing data


I would like to parse data from the following format:
to the following format:

{"user":"keren", "action": 1, "timespent": 2, "query_term":"qt", "ip_addr": 3, "timestamp": 4, "estimated_revenue": 5.0 }

[I also happen to have a map and a bag of maps but for the sake of
simplicity I didn't add them to the example]

Should I use PigPerformanceLoader or a Pig script for the above parsing?
Modifying PigPerformanceLoader seems like low-hanging fruit, though I
might have to make several changes. Modifying a Pig script seems like a more
elegant solution (I'm just not sure how to do it).


Keren Ouaknine