yun peng 2013-06-19, 20:04
Re: Possibility of using timestamp as row key in HBase
The newly split region might be moved by the load balancer. Aren't you
experiencing classic hot spotting, with only one region server (RS) receiving
all the write traffic? Just prepend a byte to the timestamp and round-robin
each put over the values 1 to the number of region servers.
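
A minimal sketch of that salting idea, in plain Python with no HBase client involved. The names make_salter and scan_ranges, the 9-byte key layout (1 salt byte plus an 8-byte big-endian timestamp), and the bucket count of 3 are assumptions made for illustration, not anything HBase prescribes.

```python
import itertools
import struct


def make_salter(num_buckets):
    """Return a function that builds salted row keys (hypothetical helper).

    Each call to the returned function takes the next salt value in
    1..num_buckets, round-robin, and prepends it to the timestamp.
    """
    salts = itertools.cycle(range(1, num_buckets + 1))

    def salted_row_key(timestamp_ms):
        # ">BQ" = big-endian, 1 unsigned byte (salt) + 8 unsigned bytes
        # (timestamp). Big-endian keeps byte order equal to numeric order,
        # so keys within one salt bucket still sort by time.
        return struct.pack(">BQ", next(salts), timestamp_ms)

    return salted_row_key


def scan_ranges(num_buckets, start_ms, stop_ms):
    """Return one (start_key, stop_key) pair per salt bucket.

    A time-range read has to visit every bucket, because consecutive
    timestamps are spread across all of them.
    """
    return [
        (struct.pack(">BQ", salt, start_ms), struct.pack(">BQ", salt, stop_ms))
        for salt in range(1, num_buckets + 1)
    ]


# Assumption: 3 buckets, one per region server in the example below.
row_key_for = make_salter(3)
```

The trade-off is on the read side: a scan over a time range becomes one scan per salt value, and the results have to be merged by timestamp in the client. The salt only spreads writes if the regions are actually split along the salt prefixes, so the table would typically be pre-split on those byte values.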

On Wednesday, June 19, 2013, yun peng wrote:

> Hi, All,
> Our use case requires persisting a stream into a system like HBase. The
> stream data is in the format <timestamp, value>. In other words, the
> timestamp is used as the row key. We want to explore whether HBase is
> suitable for this kind of data.
>
> The problem is that the domain of the row key (the timestamp) grows
> constantly. For example, given 3 nodes, n1, n2, and n3, hosting the row key
> partitions [0,4], [5,9], and [10,12] respectively, it is currently the last
> node, n3, that is busy receiving incoming writes (for row keys 13 and 14).
> This continues until the region reaches its max size of 5 (that is, the
> partition grows to [10,14]) and potentially splits.
>
> I am not an expert on HBase splits, but I am wondering whether, after a
> split, the new writes will still go to node n3 (for [10,14]) or whether the
> write stream can be intelligently redirected to another, less busy node,
> like n1.
>
> If HBase can't do this, how easy would it be to extend HBase with such
> functionality? Thanks...
> Yun
>