Pig >> mail # user >> Casting unclean data.


Casting unclean data.
Hey there,

I've just started butting heads against a problem where I'm trying to
cast bytearrays in customer-provided data to integers. The overwhelming
majority of the time, we seem to get actual integers, but I just had a
job choke when one of these should-be-integers wasn't. Is there some
sort of "is a number" test that I could use to filter the data before
trying to cast it, or do I have to write a UDF or a little program to
stream the data through in order to get this sort of data cleaning?

Thanks,
Kris

--
Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3
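
For illustration, the "little program to stream the data through" option from the question might look roughly like the sketch below. It is only a sketch under assumptions: the input is taken to be tab-delimited text with the should-be-integer in the first column, "integer" is taken to mean an optional sign followed by ASCII digits (no range check), and the names is_integer and clean_rows are made up for this example.

```python
import re
import sys

# Assumption: an "integer" is an optional sign followed by ASCII digits only.
# No check is made that the value fits in a 32-bit or 64-bit range.
INT_PATTERN = re.compile(r"[+-]?[0-9]+")


def is_integer(text):
    """Return True if the whole string looks like a plain base-10 integer."""
    return INT_PATTERN.fullmatch(text) is not None


def clean_rows(lines, column=0, delimiter="\t"):
    """Yield only the lines whose chosen column holds an integer.

    Lines that are too short to have the column are dropped as well.
    """
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if column < len(fields) and is_integer(fields[column]):
            yield line


if __name__ == "__main__":
    # Read records on stdin and write the surviving records to stdout,
    # unchanged, so the output can be cast downstream.
    for row in clean_rows(sys.stdin):
        sys.stdout.write(row)
```

A script like this could be hooked into a Pig job with the STREAM ... THROUGH operator, so that only rows passing the check reach the cast. Dropped rows vanish silently here; writing them to stderr or a side file instead would make it easier to see how much customer data is being rejected.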