Re: Replace null with string
Nonulls = foreach somenulls generate
  (field IS NULL ? 'other' : field) as field;
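
For the derived fields in your example, a two-step version may be easier to read. This is only a rough sketch: the file name 'input.tsv', the two-column schema, and the simplified regex are placeholders I made up, not your real data. It assumes REGEX_EXTRACT returns null when the pattern does not match, and it also checks for empty strings in case they reach the script as '' rather than null.

```pig
-- Sketch only: 'input.tsv' and this two-field schema are placeholders, not from the thread.
somenulls = LOAD 'input.tsv' AS (site:chararray, name:chararray);

-- Step 1: compute the derived field; it may come out as null.
extracted = FOREACH somenulls GENERATE
    site,
    REGEX_EXTRACT(name, '^([^.]+)', 1) AS project,
    name;

-- Step 2: replace null (and, defensively, empty) values with 'other'.
nonulls = FOREACH extracted GENERATE
    (site IS NULL ? 'other' : (site == '' ? 'other' : site)) AS site,
    (project IS NULL ? 'other' : project) AS project,
    name;
```

The nested conditional is used instead of OR so that the empty-string comparison is only evaluated when site is known to be non-null.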

On Jun 7, 2012, at 4:37 AM, Mario Lassnig <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I have a lot of null entries in my data. For later processing it would be very helpful if I could replace null with a default value, the string "other". I couldn't find a way to do this (version 0.8.1-cdh3u4).
>
> Also, I have some variables in my GENERATE statements that can potentially return null, and I would need something similar to the SQL DECODE function to get the "other" string instead of null.
>
> Example:
>
> tmp = FOREACH dump GENERATE
>     site,
>     REGEX_EXTRACT(name, '^(?:([^.]+)\\.?){1}', 1) AS project,
>     ((ami MATCHES '.*datatype.*') ? REGEX_EXTRACT(name, '^(?:([^.]+)\\.?){5}', 1) : 'other') AS datatype,
>     ami, duid, nbfiles, length, rnbfiles, rlength, name;
>
> Here, 'site' and 'datatype' could come back as an empty string (which is valid) that is interpreted as null, but they should be "other" instead.
>
> Thanks a lot,
> Mario