Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Pig >> mail # dev >> Handle NULL values in Cube dimensions


+
Prasanth J 2012-06-06, 21:24
+
Dmitriy Ryaboy 2012-06-07, 00:41
Copy link to this message
-
Re: Handle NULL values in Cube dimensions
But how do the user specify custom value for "*"? In the current implementation I am passing "NULL" string to the CubeDimensions constructor. If we need to get that value from user then we need some changes in grammar like

a = CUBE b BY (x,y,z) ALL as "AllProducts";

also what should be the default value if user didn't specify anything?

Thanks
-- Prasanth

On Jun 6, 2012, at 8:41 PM, Dmitriy Ryaboy wrote:

> Note that the current CubeDimensions UDF does a third thing -- instead
> of rebranding "nulls" as "unknown" and using null to mean "*" or "all
> values", the UDF allows you to specify a custom value to stand for "*"
> or "all values". That way null can be an individual valid cell value.
>
> This is (imho) much nicer than the other options, but it does
> complicate working with 3rd party tools.
>
> Does anyone have opinions on this?
>
> D
>
> On Wed, Jun 6, 2012 at 2:24 PM, Prasanth J <[EMAIL PROTECTED]> wrote:
>> Hello everyone
>>
>> I would like to bring up this discussion about the ways for handling NULL values in dimensions specified for cubing. For example, if we have a dimension color with following values
>>
>> red
>> blue
>> null
>> green
>>
>> how do we differentiate if the null value represent rollup of all colors values or actual null value?
>>
>> SQL way:
>> There are 2 ways in which SQL server analysis services handles null values in dimensions
>> 1) Throw error when it encounters null values in dimension values
>> 2) Ignore error by adding the null values to UnknownMembers. By default UnknownMembers will be named as "Unknown". The name for UnknownMembers can also be specified by the user.
>>
>> Do we need to handle both ways in Pig? I think the first way (throwing error) is pretty straightforward.
>> For the second way (ignoring error), what is the best way to provide support for user specified name for UnknownMembers?
>>
>> Please share your thoughts about how we can handle this scenario for different datatypes in Pig.
>>
>> Thanks
>> -- Prasanth
>>

+
Alan Gates 2012-06-08, 16:22
+
Prasanth J 2012-06-09, 02:00
+
Jonathan Coveney 2012-06-09, 03:06
+
Prasanth J 2012-06-12, 05:53