HBase, mail # user - Is it necessary to set MD5 on rowkey?


Re: Is it necessary to set MD5 on rowkey?
Nick Dimiduk 2012-12-19, 22:15
On Wed, Dec 19, 2012 at 1:26 PM, David Arthur <[EMAIL PROTECTED]> wrote:

> Let's say you want to decompose a url into domain and path to include in
> your row key.
>
> You could of course just use the url as the key, but you will see
> hotspotting since most will start with "http".

Doesn't the original Bigtable paper [0] design around this problem by
dropping the protocol and leading the row key with the domain? *goes to
check* Yes, it does.
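
For concreteness, the Webtable example in the Bigtable paper stores
maps.google.com/index.html under the key com.google.maps/index.html: no
protocol, and the hostname components reversed so pages from the same
domain sort together. A minimal sketch of building such a key follows; the
function name `url_row_key` and the choice to keep the query string are
assumptions for illustration, not anything from the paper or from HBase.

```python
from urllib.parse import urlsplit


def url_row_key(url: str) -> bytes:
    """Build a row key from a URL: drop the scheme, reverse the hostname.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""  # lowercased by urlsplit, port stripped
    reversed_host = ".".join(reversed(host.split(".")))
    path = parts.path or "/"
    if parts.query:  # assumption: the query string is part of the key
        path += "?" + parts.query
    return (reversed_host + path).encode("utf-8")


if __name__ == "__main__":
    print(url_row_key("http://maps.google.com/index.html"))
```

Because the key no longer begins with "http", the leading bytes vary with
the top-level domain and hostname instead of being identical for every row.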

Personally, I've never encountered an HBase schema design problem where
salting really nailed it. It's an okay place to start with initial designs,
especially if you don't know your data well. I'm a big fan of using the
natural variance in the data itself to solve this problem. OpenTSDB does
this quite well, IMHO. Plus, it's kind of a game or data puzzle -- how to
use the data's nature to your advantage :)
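
To make the contrast concrete, here is a rough sketch of the two
approaches. Everything in it is an assumption for illustration: the bucket
count, the one-byte salt, the 4-byte field widths, and the function names.
The second function is only in the spirit of a metric-first time-series
key; it is not OpenTSDB's actual key layout.

```python
import hashlib
import struct

NUM_BUCKETS = 16  # hypothetical number of salt buckets


def salted_key(key: bytes, buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix the key with a one-byte bucket derived from its MD5 hash."""
    bucket = hashlib.md5(key).digest()[0] % buckets
    return bytes([bucket]) + key


def metric_first_key(metric_id: int, timestamp: int) -> bytes:
    """Lead with a metric id, then the timestamp rounded down to the hour.

    Writes for different metrics land in different key ranges without a
    salt, and one metric's data stays contiguous for range scans.
    """
    base_hour = timestamp - (timestamp % 3600)
    return struct.pack(">II", metric_id, base_hour)


if __name__ == "__main__":
    print(salted_key(b"2012-12-19T22:15:00"))
    print(metric_first_key(7, 3661))
```

The trade-off is on the read side: a range scan over salted keys has to be
issued once per bucket and merged, while the metric-first key keeps related
rows adjacent.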

Just my 2¢
-n

[0]: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/bigtable-osdi06.pdf