Pig >> mail # user >> Casting unclean data.

Casting unclean data.
Hey there,

I've just started butting heads against a problem where I'm trying to
cast bytearrays in customer-provided data to integers. The overwhelming
majority of the time we get actual integers, but I just had a job choke
when one of these should-be-integers wasn't one. Is there some sort of
"is a number" test that I could use to filter the data before trying to
cast it, or do I have to write a UDF, or a little program to stream the
data through, in order to get this sort of data cleaning?
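
To make the "little program to stream the data through" option concrete,
here is a minimal sketch of what I have in mind, not a tested solution.
It assumes tab-delimited records on stdin with the should-be-integer in
the first column. The names is_int and clean_rows, the column index, and
the delimiter are all made up for illustration. It keeps only rows whose
field is an optionally signed run of ASCII digits that fits in a 32-bit
signed int.

```python
import re
import sys

# Optional sign followed by ASCII digits only (assumed definition of "integer").
INT_PATTERN = re.compile(r"[+-]?[0-9]+")

# Bounds of a 32-bit signed integer.
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def is_int(field):
    """Return True if field looks like an integer that fits in 32 bits."""
    if INT_PATTERN.fullmatch(field) is None:
        return False
    return INT_MIN <= int(field) <= INT_MAX


def clean_rows(lines, column=0, delimiter="\t"):
    """Yield only the lines whose chosen column passes is_int.

    column and delimiter are hypothetical defaults; adjust to the real data.
    """
    for line in lines:
        fields = line.rstrip("\r\n").split(delimiter)
        if column < len(fields) and is_int(fields[column]):
            yield line


if __name__ == "__main__":
    # Pass clean rows through unchanged; silently drop the rest.
    sys.stdout.writelines(clean_rows(sys.stdin))
```

Saved under some name (say clean_ints.py, again hypothetical), a script
like this could presumably be hooked in with Pig's STREAM ... THROUGH
operator before the cast, though I'd rather avoid the extra process if
there's a built-in way.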


Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3