HBase >> mail # user >> Is it necessary to set MD5 on rowkey?

Re: Is it necessary to set MD5 on rowkey?

There is a middle ground between sequential keys (hotspotting risk) and MD5
(heavy scans):
  * you can use composite keys with a field that segregates the data
(hostname, product name, metric name), like OpenTSDB
  * or use a salt with a limited number of values (for example,
substr(md5(rowid),0,1) gives 16 values),
    so that a scan becomes a combination of 16 filters, one on each salt value;
    you can base your code on HBaseWD by Sematext
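
To make the salt idea concrete, here is a minimal sketch in plain Python. It is not HBaseWD's API and does not talk to HBase; the function names, the "|" separator and the sample dates are assumptions for illustration only. It builds the salted key from the first hex character of md5(rowid) and lists the 16 (start, stop) ranges a client would have to scan to cover one date range.

```python
import hashlib

# The 16 possible values of the first hex character of an MD5 digest.
SALT_CHARS = "0123456789abcdef"


def salted_key(row_id):
    """Prefix row_id with the first hex character of its MD5 digest."""
    salt = hashlib.md5(row_id.encode("utf-8")).hexdigest()[0]
    return salt + "|" + row_id  # "|" is an arbitrary separator (assumption)


def scan_ranges(start_id, stop_id):
    """Return one (start_row, stop_row) pair per salt value.

    Each pair covers the row ids in [start_id, stop_id) within one bucket.
    """
    return [(s + "|" + start_id, s + "|" + stop_id) for s in SALT_CHARS]


# Hypothetical date-based row ids in yyyymmdd form.
ranges = scan_ranges("20121201", "20121219")
key = salted_key("20121218")
```

The client would run one scan per pair and merge the results itself, so a date range costs 16 small scans instead of one, while writes are spread over 16 key prefixes.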


2012/12/18 bigdata <[EMAIL PROTECTED]>

> Many articles say that using an MD5 hash as the rowkey, or as part of it, is
> a good way to balance the records stored in different parts. But if I want
> to look up sequential rowkeys, for example when a date is all or part of the
> rowkey, I cannot use a rowkey filter to scan a range of dates in one pass
> once the date has been hashed with MD5. How can I balance these two needs?
> Thanks.
Damien HARDY