HBase >> mail # user >> Rowkey design and presplit table


Re: Rowkey design and presplit table
I would convert each id to a long and then use Bytes.toBytes to convert that
long to a byte array. If the ids fit in an int, even better.
Then write the three values one after another into a single array, which
becomes your rowkey.
This gives you:
* a fixed-size rowkey
* a small rowkey: 3*8 = 24 bytes if you use long, 3*4 = 12 bytes for int.
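
A minimal sketch of that layout, assuming all three ids are non-negative
(negative values would sort after positive ones when compared as raw bytes).
The class and method names are made up for illustration. To stay free of
dependencies it uses java.nio.ByteBuffer, which writes the same big-endian
bytes as Bytes.toBytes(long); with HBase on the classpath you could
concatenate the three Bytes.toBytes results instead. The two prefix helpers
show how the "all comments for an article" and "all comments for a category"
reads map to scans over a key prefix.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class CommentRowKey {

    // Hypothetical helper: <categoryId><articleId><commentId>, each as an
    // 8-byte big-endian long, so every rowkey is exactly 24 bytes.
    public static byte[] rowKey(long categoryId, long articleId, long commentId) {
        return ByteBuffer.allocate(3 * Long.BYTES)
                .putLong(categoryId)
                .putLong(articleId)
                .putLong(commentId)
                .array();
    }

    // 16-byte prefix shared by all comments of one article.
    public static byte[] articlePrefix(long categoryId, long articleId) {
        return ByteBuffer.allocate(2 * Long.BYTES)
                .putLong(categoryId)
                .putLong(articleId)
                .array();
    }

    // 8-byte prefix shared by all comments of one category.
    public static byte[] categoryPrefix(long categoryId) {
        return ByteBuffer.allocate(Long.BYTES)
                .putLong(categoryId)
                .array();
    }

    public static void main(String[] args) {
        byte[] key = rowKey(7L, 1234L, 99L);
        System.out.println("rowkey length: " + key.length);

        // The first 16 bytes of the rowkey are the article prefix.
        boolean matches = Arrays.equals(
                Arrays.copyOf(key, 2 * Long.BYTES), articlePrefix(7L, 1234L));
        System.out.println("starts with article prefix: " + matches);
    }
}
```

Because every key has the same width, split points for a presplit table can
be plain byte arrays built the same way, for example from category id
boundaries.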

Why do you need to use prefix split policy?

On Monday, March 4, 2013, Lukáš Drbal wrote:

> Hi,
>
> I have one question about rowkey design and presplitting a table.
>
> My use case:
> I need to store a lot of comments, where each comment belongs to one article
> and each article has one category.
>
> What I need:
> 1) read one comment by id (where I know commentId, articleId and
> categoryId)
> 2) read all comments for an article (I know categoryId and articleId)
> 3) read all comments for a category (I know categoryId)
>
> From this read pattern I see one good rowkey:
> <categoryId>_<articleId>_<commentId>
>
> But here the rowkey does not have a fixed size, so I don't know how to define
> the split pattern. How can this be solved?
> These ids come from an external system and grow very fast, so adding something
> like "padding" to each part is hard.
>
> Maybe I can use a hash function for each part,
> md5(<categoryId>)_md5(<articleId>)_md5(<commentId>), but this rowkey is very
> long (3*32+2 bytes), and I have no experience with rowkeys this long.
>
> Can someone give me some suggestions, please?
>
> Regards
>
> Lukas Drbal
>