Pig >> mail # dev >> Handle NULL values in Cube dimensions


Handle NULL values in Cube dimensions
Hello everyone

I would like to start a discussion on how to handle NULL values in the dimensions specified for cubing. For example, suppose we have a dimension color with the following values:

red
blue
null
green

How do we tell whether a null value in the output represents the rollup of all color values or an actual null value in the data?
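To make the ambiguity concrete, here is a minimal sketch in plain Python (not Pig, and not how any cube operator is actually implemented). The function name cube_one_dimension and the choice of a row count as the measure are assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical input: one record per row, with None standing in for a NULL color.
colors = ["red", "blue", None, "green"]

def cube_one_dimension(values):
    """Count rows per dimension value, then append a rollup row keyed by None."""
    rows = [(value, count) for value, count in Counter(values).items()]
    # The rollup ("all colors") row uses null as its key, which is the
    # convention that causes the clash with genuine null values.
    rows.append((None, len(values)))
    return rows

result = cube_one_dimension(colors)

# Two rows now share the key None: one for the real null color,
# one for the rollup over all colors.
null_rows = [row for row in result if row[0] is None]
```

Looking only at the key, the row for the real null color and the rollup row cannot be told apart; only the counts differ.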

SQL way:
SQL Server Analysis Services handles null values in dimensions in one of two ways:
1) Throw an error when it encounters a null value in a dimension.
2) Ignore the error and add the null values to UnknownMembers. By default UnknownMembers is named "Unknown", but the user can specify a different name.
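The two behaviours above could be sketched roughly as follows, again in plain Python rather than Pig. The names prepare_dimension, on_null and unknown_name are hypothetical and do not correspond to any existing Pig or Analysis Services API; this only illustrates the idea of cleaning a dimension before cubing.

```python
def prepare_dimension(values, on_null="error", unknown_name="Unknown"):
    """Apply one of two null-handling policies to a list of dimension values.

    on_null="error":   raise as soon as a null (None) value is seen.
    on_null="unknown": replace each null with unknown_name.
    """
    if on_null not in ("error", "unknown"):
        raise ValueError("on_null must be 'error' or 'unknown'")
    prepared = []
    for value in values:
        if value is None:
            if on_null == "error":
                raise ValueError("null value found in cube dimension")
            # Map the null to the (possibly user-specified) unknown member name.
            value = unknown_name
        prepared.append(value)
    return prepared

colors = ["red", "blue", None, "green"]
with_default_name = prepare_dimension(colors, on_null="unknown")
with_custom_name = prepare_dimension(colors, on_null="unknown", unknown_name="N/A")
```

With the nulls replaced before cubing, any null left in the cube output can only mean a rollup. One open point with this approach is that a string name such as "Unknown" only fits chararray dimensions, so other datatypes would need a different placeholder.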

Do we need to support both ways in Pig? I think the first way (throwing an error) is pretty straightforward.
For the second way (ignoring the error), what is the best way to let the user specify the name for UnknownMembers?

Please share your thoughts on how we can handle this scenario for the different datatypes in Pig.

Thanks
-- Prasanth

Dmitriy Ryaboy 2012-06-07, 00:41
Prasanth J 2012-06-08, 02:41
Alan Gates 2012-06-08, 16:22
Prasanth J 2012-06-09, 02:00
Jonathan Coveney 2012-06-09, 03:06
Prasanth J 2012-06-12, 05:53