HBase, mail # user - HBase Types: Explicit Null Support


Re: HBase Types: Explicit Null Support
Matt Corgan 2013-04-02, 00:07
I generally don't allow nulls in my composite row keys.  Does SQL allow
nulls in the PK?  In the rare case I wanted to do that I might create a
separate format called NullableCInt32 with 5 bytes where the first one
determined null.  It's important to keep the pure types pure.
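
A minimal sketch of what such a 5-byte nullable format could look like, in Java. The class name NullableInt32Codec, the marker values, and the sign-bit flip used to keep the bytes sortable are all assumptions made for illustration; this is not the actual data-tools or HBase implementation.

```java
// Hypothetical 5-byte nullable int encoding: 1 marker byte + 4 value bytes.
// The marker values and the class name are illustrative assumptions.
final class NullableInt32Codec {
    static final int WIDTH = 5;
    private static final byte NULL_MARKER = 0x00;    // sorts before any present value
    private static final byte PRESENT_MARKER = 0x01;

    private NullableInt32Codec() {}

    /** Encodes a possibly-null Integer into exactly 5 bytes. */
    static byte[] encode(Integer value) {
        byte[] out = new byte[WIDTH];
        if (value == null) {
            out[0] = NULL_MARKER;                     // remaining bytes stay zero
            return out;
        }
        out[0] = PRESENT_MARKER;
        // Flip the sign bit so unsigned byte-wise order matches signed int order.
        int flipped = value ^ Integer.MIN_VALUE;
        out[1] = (byte) (flipped >>> 24);
        out[2] = (byte) (flipped >>> 16);
        out[3] = (byte) (flipped >>> 8);
        out[4] = (byte) flipped;
        return out;
    }

    /** Decodes 5 bytes back into an Integer, or null if the marker says so. */
    static Integer decode(byte[] bytes) {
        if (bytes.length != WIDTH) {
            throw new IllegalArgumentException("expected " + WIDTH + " bytes");
        }
        if (bytes[0] == NULL_MARKER) {
            return null;
        }
        int flipped = ((bytes[1] & 0xFF) << 24)
                | ((bytes[2] & 0xFF) << 16)
                | ((bytes[3] & 0xFF) << 8)
                | (bytes[4] & 0xFF);
        return flipped ^ Integer.MIN_VALUE;
    }

    /** Unsigned lexicographic comparison, the way row keys are ordered. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

With this layout the full int range, including -2^31, stays representable, at the cost of one extra byte per key component; the plain 4-byte type is left untouched.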

I do have lots of null *values*, but they're represented by the lack of a
qualifier in the Put.  If a row has all null values, I create a dummy
qualifier with a dummy value to make sure the row key gets inserted as it
would in SQL.
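
The rule can be sketched in plain Java, with a map standing in for the cells that would go into a Put. SparseRowModel, cellsToWrite, and the "_" dummy qualifier are hypothetical names chosen for this sketch, not HBase API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Models "null value == no qualifier written". All names here are
// illustrative assumptions; a real client would add the surviving
// entries to an HBase Put instead of returning a map.
final class SparseRowModel {
    static final String DUMMY_QUALIFIER = "_";        // hypothetical placeholder qualifier
    static final byte[] DUMMY_VALUE = new byte[0];    // hypothetical placeholder value

    private SparseRowModel() {}

    /** Returns the qualifier/value pairs that should actually be written. */
    static Map<String, byte[]> cellsToWrite(Map<String, byte[]> columns) {
        Map<String, byte[]> cells = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> entry : columns.entrySet()) {
            if (entry.getValue() != null) {           // null columns are simply omitted
                cells.put(entry.getKey(), entry.getValue());
            }
        }
        if (cells.isEmpty()) {
            // Every column was null: write a dummy cell so the row key still exists.
            cells.put(DUMMY_QUALIFIER, DUMMY_VALUE);
        }
        return cells;
    }
}
```

A reader then treats a missing qualifier as null and ignores the dummy qualifier.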
On Mon, Apr 1, 2013 at 4:49 PM, James Taylor <[EMAIL PROTECTED]> wrote:

> On 04/01/2013 04:41 PM, Nick Dimiduk wrote:
>
>> On Mon, Apr 1, 2013 at 4:31 PM, James Taylor <[EMAIL PROTECTED]> wrote:
>>
>>> From the SQL perspective, handling null is important.
>>
>> From your perspective, it is critical to support NULLs, even at the expense
>> of fixed-width encodings at all or supporting representation of a full
>> range of values. That is, you'd rather be able to represent NULL than
>> -2^31?
>>
> We've been able to get away with supporting NULL through the absence of
> the value rather than restricting the data range. We haven't had any push
> back on not allowing a fixed width nullable leading row key column. Since
> our variable length DECIMAL supports null and is a superset of the fixed
> width numeric types, users have a reasonable alternative.
>
> I'd rather not restrict the range of values, since it doesn't seem like
> this would be necessary.
>
>
>>> On 04/01/2013 01:32 PM, Nick Dimiduk wrote:
>>>
>>>> Thanks for the thoughtful response (and code!).
>>>>
>>>> I'm thinking I will press forward with a base implementation that does not
>>>> support nulls. The idea is to provide an extensible set of interfaces, so I
>>>> think this will not box us into a corner later. That is, a mirroring
>>>> package could be implemented that supports null values and accepts
>>>> the relevant trade-offs.
>>>>
>>>> Thanks,
>>>> Nick
>>>>
>>>> On Mon, Apr 1, 2013 at 12:26 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> I spent some time this weekend extracting bits of our serialization code to
>>>>> a public GitHub repo at http://github.com/hotpads/data-tools .
>>>>> Contributions are welcome - I'm sure we all have this stuff lying around.
>>>>>
>>>>> You can see I've bumped into the NULL problem in a few places:
>>>>> * https://github.com/hotpads/data-tools/blob/master/src/main/java/com/hotpads/data/primitive/lists/LongArrayList.java
>>>>> * https://github.com/hotpads/data-tools/blob/master/src/main/java/com/hotpads/data/types/floats/DoubleByteTool.java
>>>>>
>>>>> Looking back, I think my latest opinion on the topic is to reject
>>>>> nullability as the rule since it can cause unexpected behavior and
>>>>> confusion.  It's cleaner to provide a wrapper class (so both LongArrayList
>>>>> plus NullableLongArrayList) that explicitly defines the behavior, and costs
>>>>> a little more in performance.  If the user can't find a pre-made wrapper
>>>>> class, it's not very difficult for each user to provide their own