HBase >> mail # user >> Is it necessary to set MD5 on rowkey?

Re: Is it necessary to set MD5 on rowkey?
On Wed, Dec 19, 2012 at 1:26 PM, David Arthur <[EMAIL PROTECTED]> wrote:

> Let's say you want to decompose a url into domain and path to include in
> your row key.
> You could of course just use the url as the key, but you will see
> hotspotting since most will start with "http".
Doesn't the original Bigtable paper [0] design around this problem by
dropping the protocol and only storing the domain? *goes to check* Yes, it
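
As a rough illustration of the idea, here is a minimal Python sketch that builds a row key from a URL by dropping the scheme and keeping the host and path. Reversing the host labels follows the example in the Bigtable paper (com.google.maps/index.html); the function name url_row_key is hypothetical and not from this thread.

```python
from urllib.parse import urlsplit


def url_row_key(url: str) -> str:
    """Build a row key from a URL (hypothetical helper for illustration).

    The scheme ("http", "https") is dropped so that keys do not all share
    the same leading bytes, and the host labels are reversed so that pages
    from the same domain sort next to each other.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    reversed_host = ".".join(reversed(host.split(".")))
    path = parts.path or "/"
    return reversed_host + path


print(url_row_key("http://www.cnn.com/index.html"))
print(url_row_key("https://maps.google.com/index.html"))
```

With keys built this way the leading bytes vary by top-level domain and domain rather than by protocol, although a very popular domain can still concentrate load on one region.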

Personally, I've never encountered an HBase schema design problem where
salting really nailed it. It's an okay place to start with initial designs,
especially if you don't know your data well. I'm a big fan of using the
natural variance in the data itself to solve this problem. OpenTSDB does
this quite well, IMHO. Plus, it's kind of a game or data puzzle -- how to
use the data's nature to your advantage :)
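
To make the contrast concrete, here is a small Python sketch of the two approaches. It is an illustration under assumptions, not anyone's production schema: NUM_BUCKETS, salted_key, and metric_first_key are hypothetical names, and the second layout is only loosely modeled on the OpenTSDB idea of leading the key with a metric identifier ahead of the timestamp (the widths here are simplified).

```python
import hashlib
import struct

NUM_BUCKETS = 16  # hypothetical number of salt buckets


def salted_key(key: bytes, buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix the key with one byte derived from its MD5 hash.

    The prefix is deterministic, so a reader can recompute it for a point
    lookup, but a range scan has to visit every bucket.
    """
    salt = hashlib.md5(key).digest()[0] % buckets
    return bytes([salt]) + key


def metric_first_key(metric_id: int, base_ts: int) -> bytes:
    """Lead with a fixed-width metric id, then the timestamp.

    Writes spread across metrics because the id varies, while rows for
    one metric stay contiguous and ordered by time.
    """
    return struct.pack(">II", metric_id, base_ts)


# Salting scatters lexically adjacent keys across buckets.
print(salted_key(b"2012-12-19T13:26:00"))

# The metric-first layout keeps one metric's rows sorted by time.
keys = [metric_first_key(2, 50), metric_first_key(1, 200), metric_first_key(1, 100)]
print(sorted(keys) == [metric_first_key(1, 100), metric_first_key(1, 200), metric_first_key(2, 50)])
```

The trade-off shows up on reads: the salted layout needs one scan per bucket to reassemble a range, while the metric-first layout answers a time-range query for one metric with a single contiguous scan.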

Just my 2¢