Pig >> mail # user >> Replace null with string


Re: Replace null with string
Nonulls = FOREACH somenulls GENERATE
  (field IS NULL ? 'other' : field) AS field;
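
Pig tests for null with IS NULL rather than an equality comparison, since comparing a value to null yields null. A minimal sketch of the same idea applied to the fields from the question, assuming a hypothetical input relation named raw with chararray fields site and name, and also treating the empty string as missing:

```pig
-- Assumption: `raw` is a hypothetical alias for the loaded input;
-- `site` and `name` are chararray fields as in the question.
extracted = FOREACH raw GENERATE
    site,
    REGEX_EXTRACT(name, '^(?:([^.]+)\\.?){5}', 1) AS datatype;

-- Replace null (and, as an extra assumption, empty) values with 'other'.
cleaned = FOREACH extracted GENERATE
    ((site IS NULL OR site == '') ? 'other' : site) AS site,
    ((datatype IS NULL OR datatype == '') ? 'other' : datatype) AS datatype;
```

Splitting the work into two FOREACH statements gives the extracted value a name, so the null check does not have to repeat the REGEX_EXTRACT call.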

On Jun 7, 2012, at 4:37 AM, Mario Lassnig <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I have a lot of null entries in my data. For later processing it would be very helpful if I could set a default value for null, namely the string "other". I couldn't find a way to do this (version 0.8.1-cdh3u4).
>
> Also, I have some variables in my GENERATE statements that can potentially return null, and I would need something similar to the SQL DECODE function to get the "other" string instead of null.
>
> Example:
>
> tmp = FOREACH dump GENERATE site, REGEX_EXTRACT(name, '^(?:([^.]+)\\.?){1}', 1) AS project, ((ami MATCHES '.*datatype.*') ? REGEX_EXTRACT(name, '^(?:([^.]+)\\.?){5}', 1) : 'other') AS datatype, ami, duid, nbfiles, length, rnbfiles, rlength, name;
>
> Here: 'site' and 'datatype' could return an empty string (which is valid) and is interpreted as null, but should be "other" instead.
>
> Thanks a lot,
> Mario