HBase, mail # user - hbase schema design


Re: hbase schema design
Ted Yu 2013-09-17, 16:53
I guess you were referring to section 6.3.2

bq. rowkey is stored and/or read for every cell value

The above is true.
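
As a rough illustration of what that means for the design in the question, the sketch below estimates the unencoded size of one event type's row. It assumes the classic KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte key type, value). The one-byte family name, 8-byte qualifier and 8-byte value are assumptions made for the example, and newer HBase versions may add further per-cell fields.

```python
# Rough per-cell size under the classic KeyValue layout, with no encoding
# or compression. Family, qualifier and value lengths are illustrative
# assumptions, not taken from the thread.

# key length + value length + row length + family length + timestamp + key type
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row_len, family_len, qualifier_len, value_len):
    """Bytes for one cell: fixed fields plus row, family, qualifier, value."""
    return FIXED_OVERHEAD + row_len + family_len + qualifier_len + value_len


def row_size(row_len, cells, family_len=1, qualifier_len=8, value_len=8):
    """Bytes for one logical row whose `cells` cells all repeat the row key."""
    return cells * cell_size(row_len, family_len, qualifier_len, value_len)


if __name__ == "__main__":
    days = 7300  # roughly 20 years of daily columns, as in the question
    for label, row_len in [("2 KB description", 2048), ("16-byte surrogate", 16)]:
        total = row_size(row_len, days)
        print(f"{label:>18}: {total / 1024 / 1024:.2f} MiB per event type")
```

Under these assumptions the 2 KB row key costs roughly 14.5 MiB per event type before any encoding or compression, against about 0.4 MiB with a 16-byte key.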

bq. the event description is a string of 0.1 to 2 KB

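You can enable data block encoding on the column family (for example PREFIX,
DIFF or FAST_DIFF) to reduce storage, since the repeated parts of the key are
then not stored in full for every cell.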

Cheers

On Tue, Sep 17, 2013 at 9:44 AM, Adrian CAPDEFIER <[EMAIL PROTECTED]> wrote:

> Howdy all,
>
> I'm trying to use HBase for the first time (though I have plenty of
> experience with RDBMSs), and I have a couple of questions after reading The
> Book.
>
> I am a bit confused by the advice to reduce "the row size" in the HBase
> book. It states that every cell value is accompanied by its coordinates
> (row, column and timestamp). I'm just trying to be thorough, so am I to
> understand that the rowkey is stored and/or read for every cell value in a
> record, or just once per column family in a record?
>
> I am intrigued by the rows as columns design as described in the book at
> http://hbase.apache.org/book.html#rowkey.design. To make a long story
> short, I will end up with a table that stores event types and the number of
> occurrences on each day. I would prefer to have the event description as
> the row key and the dates when it happened as columns - up to 7300 for
> roughly 20 years.
> However, the event description is a string of 0.1 to 2 KB, and if it is
> stored for each cell value, I will need to use a surrogate (shorter) value.
>
> Is there built-in functionality to generate (integer) surrogate values in
> HBase that can be used as the rowkey, or does it need to be hand-coded
> from scratch?
>
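
One hand-coded option for the surrogate value asked about above is to derive a fixed-length key by hashing the description. The sketch below is only an illustration under that assumption: the function names are hypothetical, MD5 is used purely as a compact digest (not for security), and because a hash cannot be reversed the full description would still need to be stored once, for example as a cell value.

```python
import hashlib


def surrogate_key(description: str) -> bytes:
    """Fixed 16-byte row key derived from the event description (MD5 digest)."""
    return hashlib.md5(description.encode("utf-8")).digest()


def column_qualifier(date_iso: str) -> bytes:
    """Compact qualifier for one day, e.g. '2013-09-17' becomes b'20130917'."""
    return date_iso.replace("-", "").encode("ascii")


if __name__ == "__main__":
    # Hypothetical event description, standing in for a 0.1 to 2 KB string.
    description = "Disk failure detected on storage node " + "x" * 500
    key = surrogate_key(description)
    print(len(description), "byte description ->", len(key), "byte row key")
    print(key.hex(), column_qualifier("2013-09-17"))
```

The same description always maps to the same key, so no lookup table is needed to find a row, at the cost of losing any meaningful sort order across event types.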