Weishung Chung 2011-01-10, 15:33
Friso van Vollenhoven 2011-01-10, 15:50
Chirstopher Tarnas 2011-01-10, 16:05
Matt Corgan 2011-01-10, 16:08
Thank you for the replies.
Most of the queries (about 70%) will scan a range of consecutive
timestamps; the rest (about 30%) will look up a single timestamp.
There are multiple tables covering the same range of timestamps. Will the
rows for the same time range from all of these tables be stored on the same
region server? If so, could that hurt the performance of MapReduce jobs
that operate on those tables over the same time periods? Would
hotspotting defeat the purpose of MapReduce?
On Mon, Jan 10, 2011 at 10:08 AM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> You can also add a random (or hashed) prefix to the beginning of the key.
> If your prefix were one byte with values 0-63, you've divided the hot spot
> into 64 smaller ones, which is better for writing. The downside is that if
> you want to read a range of values, you will have to query all 64 "shards"
> and merge the sorted values. You can choose whatever prefix size is best
> for your scenario.
> On Mon, Jan 10, 2011 at 11:05 AM, Chirstopher Tarnas <[EMAIL PROTECTED]>
> wrote:
> > Some options that I am aware of:
> > - reverse the byte order of the timestamp
> > - use UUIDs rather than a timestamp
> > - use hashing; whether this works really depends on your requirements
> > On Mon, Jan 10, 2011 at 9:33 AM, Weishung Chung <[EMAIL PROTECTED]>
> > wrote:
> > > What is a good way to randomize the primary key, which is a timestamp,
> > > in HBase to avoid hotspotting?
> > > Thank you so much :)
> > >
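
For readers following the thread, the prefix scheme Matt Corgan describes in the quoted reply can be sketched roughly as below. This is only an illustration under stated assumptions: it uses no HBase client API, a sorted in-memory list of keys stands in for a table, and the names `NUM_SHARDS`, `salted_key`, and `scan_range` are hypothetical. It also assumes the hashed variant of the prefix (derived from a record id) rather than a purely random one.

```python
import hashlib
import heapq
import struct
from bisect import bisect_left

NUM_SHARDS = 64  # one-byte prefix with values 0-63, as in the quoted reply


def salted_key(timestamp_ms: int, record_id: bytes) -> bytes:
    """Build a row key: 1-byte shard prefix + big-endian timestamp + record id."""
    # Hashing the record id (an assumption) keeps the prefix deterministic,
    # so the same record always lands in the same shard.
    shard = hashlib.md5(record_id).digest()[0] % NUM_SHARDS
    return bytes([shard]) + struct.pack(">Q", timestamp_ms) + record_id


def scan_range(sorted_keys, start_ms: int, end_ms: int):
    """Return keys with start_ms <= timestamp < end_ms, in timestamp order.

    sorted_keys is a byte-sorted list of salted keys standing in for a table.
    One range lookup is issued per shard, then the sorted results are merged.
    """
    per_shard = []
    for shard in range(NUM_SHARDS):
        lo = bytes([shard]) + struct.pack(">Q", start_ms)
        hi = bytes([shard]) + struct.pack(">Q", end_ms)
        i = bisect_left(sorted_keys, lo)
        j = bisect_left(sorted_keys, hi)
        per_shard.append(sorted_keys[i:j])
    # Within a shard the prefix is constant, so each slice is already sorted
    # by the bytes after the prefix; merge on those bytes.
    return list(heapq.merge(*per_shard, key=lambda k: k[1:]))


# Usage: 100 consecutive timestamps, one record each.
table = sorted(salted_key(1000 + i, f"r{i}".encode()) for i in range(100))
hits = scan_range(table, 1010, 1020)
```

Writes for consecutive timestamps are spread across up to 64 key ranges, while a range read costs one lookup per shard plus a merge. That is the trade-off mentioned in the reply, and it matters here because most of the stated queries are range scans.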
Ted Dunning 2011-01-10, 16:30
Matt Corgan 2011-01-10, 16:41
Weishung Chung 2011-01-10, 16:56
Matt Corgan 2011-01-10, 17:04
Weishung Chung 2011-01-10, 17:42
Tost 2011-01-11, 00:18