Storing dates in LzoJsonStorage

I'm working on a Pig script (and some associated UDFs) to do a little
cleaning on some data, stored as JSON objects, that is used by other
scripts down the pipe. The JSON objects contain dates, such as:

        "end_time" : ISODate("2012-11-07T00:29:58.728Z"),

which I've noticed I can read as end_time#'$date' to get a
milliseconds-since-epoch long.

In order not to cause problems with the scripts down the pipe, I'd like
to be able to store the dates back into JSON objects using
LzoJsonStorage, so that the existing scripts can just be repointed at
the cleaned data, and otherwise continue working without any rewrites.

The last line in the Pig script doing this cleaning is a UDF call that
generates the map expected by LzoJsonStorage, so if there's some
trickery that can be done with Java objects which don't correspond to
Pig data types (e.g. Date), I'd be up for giving that a go.
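
For concreteness, here is a rough sketch of the kind of map I have in
mind, written as plain Java with no Pig or LzoJsonStorage classes
involved. The class and method names (DateFieldSketch, wrapDate,
toIsoString) are made up for illustration, and it assumes Java 8+ for
java.time. It simply mirrors the shape I see on the read side (a
"$date" key holding a long); whether LzoJsonStorage would serialize a
nested map like this in a form the downstream scripts accept is an
assumption, not something I've verified.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Pig or elephant-bird.
public class DateFieldSketch {

    // Wraps epoch milliseconds in a one-entry map keyed by "$date",
    // mirroring what end_time#'$date' returns when reading.
    public static Map<String, Object> wrapDate(long epochMillis) {
        Map<String, Object> wrapper = new HashMap<>();
        wrapper.put("$date", epochMillis);
        return wrapper;
    }

    // Formats epoch milliseconds as an ISO-8601 UTC string, in case the
    // string form turns out to be what needs to be stored instead.
    public static String toIsoString(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        long millis = Instant.parse("2012-11-07T00:29:58.728Z").toEpochMilli();

        // The record the final UDF would hand to the storage function
        // (assumed shape: field name -> nested "$date" map).
        Map<String, Object> record = new HashMap<>();
        record.put("end_time", wrapDate(millis));

        System.out.println(record);
        System.out.println(toIsoString(millis));
    }
}
```

In a real UDF the same nested map would be built inside exec() and
returned as the Pig map value, since a map with a long inside it stays
within Pig's own data types.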

Can this be done? If so, is there a Right Way to do it?


Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3