HBase >> mail # user >> HBase Types: Explicit Null Support

HBase Types: Explicit Null Support

Thinking about data types and serialization. I think null support is an
important characteristic for the serialized representations, especially
when considering the compound type. However, doing so is directly
incompatible with fixed-width representations for numerics. For instance,
if we want to have a fixed-width signed long stored in 8 bytes, where do
you put null? float and double types can cheat a little by folding negative
and positive NaNs into a single representation (this isn't strictly
correct!), leaving a place to represent null. In the long example case, the
obvious choice is to reduce MAX_VALUE or increase MIN_VALUE by one. This
frees up one encoding, which can then be used for null. My
experience working with scientific data, however, makes me wince at the
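
To make the fixed-width trade-off concrete, here is a rough sketch in Python (picked only for brevity). It assumes the usual order-preserving trick of flipping the sign bit and writing big-endian, and it gives the all-zero pattern to null. For long that costs MIN_VALUE; for double it costs the distinction between NaNs. All names here (encode_long, encode_double, NULL_8, and so on) are made up for illustration and are not an existing HBase API.

```python
import math
import struct

LONG_MIN = -(2**63)      # stand-in for Java's Long.MIN_VALUE
LONG_MAX = 2**63 - 1     # stand-in for Java's Long.MAX_VALUE
NULL_8 = b"\x00" * 8     # the one 8-byte pattern reserved for null
ALL_ONES = 0xFFFFFFFFFFFFFFFF
SIGN_BIT = 1 << 63
CANONICAL_NAN_BITS = 0x7FF8000000000000  # assumption: one positive quiet NaN stands for all NaNs


def encode_long(value):
    """Nullable signed 64-bit integer; LONG_MIN gives up its slot to null."""
    if value is None:
        return NULL_8
    if not LONG_MIN < value <= LONG_MAX:
        raise ValueError("value outside the reduced long range")
    # Adding 2**63 flips the sign bit, so unsigned big-endian order equals signed order.
    return struct.pack(">Q", value - LONG_MIN)


def decode_long(data):
    if data == NULL_8:
        return None
    return struct.unpack(">Q", data)[0] + LONG_MIN


def encode_double(value):
    """Nullable double; every NaN is folded into one canonical NaN."""
    if value is None:
        return NULL_8
    if math.isnan(value):
        bits = CANONICAL_NAN_BITS
    else:
        bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    # Negative values: invert every bit. Non-negative values: set the sign bit.
    bits = bits ^ ALL_ONES if bits & SIGN_BIT else bits | SIGN_BIT
    return struct.pack(">Q", bits)


def decode_double(data):
    if data == NULL_8:
        return None
    bits = struct.unpack(">Q", data)[0]
    # Undo the transform: a set top bit means the original was non-negative.
    bits = bits ^ SIGN_BIT if bits & SIGN_BIT else bits ^ ALL_ONES
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

With this layout the all-zero pattern would otherwise have been LONG_MIN (for long) or one particular negative NaN (for double), so null sorts first in both cases and every value stays at exactly 8 bytes. The price is what the paragraph above describes: encode_long(LONG_MIN) is rejected, and a decoded NaN no longer carries its original sign or payload.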

The variable-width encodings have it a little easier. There's already
enough going on in those formats that making room for null is simpler.

Remember, the final goal is to support order-preserving serialization. This
imposes some limitations on our encoding strategies. For instance, it's not
enough to simply encode null; it really needs to be encoded as 0x00 so that
it sorts lexicographically before any other value.
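
As a rough illustration of the 0x00 requirement on a variable-width value, here is a minimal Python sketch for a nullable UTF-8 string. It assumes a one-byte header on every non-null value; the header value 0x01 and the function names are hypothetical, chosen only so that no real value can start with 0x00.

```python
NULL_MARKER = b"\x00"     # null is a single zero byte
PRESENT_MARKER = b"\x01"  # hypothetical header byte for every non-null value


def encode_text(value):
    """Nullable string; null sorts before every non-null value, including ""."""
    if value is None:
        return NULL_MARKER
    return PRESENT_MARKER + value.encode("utf-8")


def decode_text(data):
    if data == NULL_MARKER:
        return None
    if data[:1] != PRESENT_MARKER:
        raise ValueError("unrecognized header byte")
    return data[1:].decode("utf-8")


# Byte-wise sorting of the encodings puts null first, then the empty string.
keys = sorted(encode_text(v) for v in ["b", None, "", "a"])
ordered = [decode_text(k) for k in keys]
```

The cost is one extra byte per non-null value. This sketch covers a standalone value only; inside a compound type a variable-width field would also need a terminator or length scheme, which is left out here.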

What do you think? Any ideas, experiences, etc?

Doug Meil 2013-04-01, 18:41
Matt Corgan 2013-04-01, 19:26
Nick Dimiduk 2013-04-01, 20:32
James Taylor 2013-04-01, 23:31
Nick Dimiduk 2013-04-01, 23:41
Nick Dimiduk 2013-04-02, 02:26
Enis Söztutar 2013-04-02, 03:38
Matt Corgan 2013-04-02, 06:17
Michel Segel 2013-04-02, 02:40
James Taylor 2013-04-01, 23:49
Matt Corgan 2013-04-02, 00:07
Nick Dimiduk 2013-04-05, 00:34