Pig, mail # user - CSV format loader


CSV format loader
James Kebinger 2009-12-08, 23:12
Hi all, I realized a week or two ago that PigStorage(',') isn't adequate for
parsing files that have commas embedded in properly quoted CSV fields.

I went ahead and built a CSV parser for Pig 0.3 that deals with embedded
quotes (but not embedded newlines). It's up on GitHub:
http://github.com/jkebinger/pig-user-defined-functions/tree/master/src/com/kebinger/pig/storage/
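
To illustrate the problem, here is a minimal sketch in Python. It is not the loader linked above; the function names split_naive and split_csv_line are hypothetical, and the quoting rules assumed are the common ones (fields wrapped in double quotes, a doubled quote standing for a literal quote, no embedded newlines). The first function splits on every comma, which is the behavior that breaks on quoted fields. The second tracks whether it is inside quotes:

```python
def split_naive(line, delimiter=","):
    # Split on every delimiter, ignoring quoting entirely.
    return line.split(delimiter)


def split_csv_line(line, delimiter=",", quote='"'):
    # Quote-aware split of a single line (assumes no embedded newlines).
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < len(line) and line[i + 1] == quote:
                    # A doubled quote inside a quoted field is a literal quote.
                    current.append(quote)
                    i += 1
                else:
                    # A lone quote closes the quoted section.
                    in_quotes = False
            else:
                # Delimiters inside quotes are ordinary characters.
                current.append(ch)
        elif ch == quote:
            in_quotes = True
        elif ch == delimiter:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


row = '1,"Smith, John",42'
print(split_naive(row))      # the quoted name is cut in two
print(split_csv_line(row))   # the quoted name stays in one field
```

With the sample row, the naive split yields four pieces because the comma inside the quoted name is treated as a separator, while the quote-aware split yields the three intended fields.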

What I want to know is: is there interest in having this in the PiggyBank?
I'm happy to upgrade it to be compatible with the current Pig version and
write some tests if there's interest.

thanks

-james