HBase >> mail # user >> Key formats and very low cardinality leading fields


Re: Key formats and very low cardinality leading fields
Hi Eric,

In HBase, data is stored sorted by row key, in lexicographic (byte) order.

It will depend on the number of regions and region servers you have, but
if you write data from 23AAAAAA to 23ZZZZZZ, the rows will most probably
go to the same region, even if the cardinality of the 2nd part of the
key is high.
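
To make this concrete, here is a small Python sketch (not HBase code)
that models a table as a sorted list of region split keys. The split
points and the region_for helper are made up for illustration; a real
table's splits depend on how it was pre-split and how it has grown.

```python
import bisect

# Hypothetical split keys for a table with 4 regions:
#   region 0: keys < "10"
#   region 1: "10" <= keys < "20"
#   region 2: "20" <= keys < "30"
#   region 3: keys >= "30"
SPLIT_KEYS = [b"10", b"20", b"30"]

def region_for(row_key, split_keys=SPLIT_KEYS):
    """Return the index of the region whose key range contains row_key."""
    # bisect_right puts a key equal to a split key into the region
    # that starts at that split key.
    return bisect.bisect_right(split_keys, row_key)

# Very different second fields, same leading field "23".
keys = [b"23AAAAAA", b"23MXQ9ZT", b"23ZZZZZZ"]
regions = {region_for(k) for k in keys}
print(regions)
```

Under these assumed splits, every key starting with 23 sorts between
"20" and "30", so all of them map to the same region no matter how
varied the rest of the key is.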

If the first number changes between 1 and 30 on each write, the writes
will be spread across multiple regions and region servers, provided you
have more than one. Otherwise, you might see some hot-spotting.
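
As a rough illustration of the difference, the Python sketch below
counts how many writes land in each region for a fixed leading field
versus one that rotates through 1 to 30. It is a toy model, not HBase
code: the split keys, the zero-padded two-digit prefix, and the
make_key helper are all assumptions chosen for the example.

```python
import bisect
from collections import Counter

# Hypothetical split keys giving 4 regions (assumption, not from a real table).
SPLIT_KEYS = [b"10", b"20", b"30"]

def region_for(row_key):
    """Index of the region whose key range contains row_key."""
    return bisect.bisect_right(SPLIT_KEYS, row_key)

def make_key(prefix, suffix):
    # Zero-pad the leading field so "02" sorts before "10" byte-wise.
    return f"{prefix:02d}{suffix}".encode("ascii")

# 3000 distinct second-field values; their order does not matter here,
# because the region is decided by the leading field under these splits.
suffixes = [f"{i:06X}" for i in range(3000)]

# Case 1: every write uses the same leading field (23).
fixed = Counter(region_for(make_key(23, s)) for s in suffixes)

# Case 2: the leading field cycles through 1..30 from write to write.
rotating = Counter(
    region_for(make_key(1 + i % 30, s)) for i, s in enumerate(suffixes)
)

print("fixed prefix:   ", dict(fixed))
print("rotating prefix:", dict(rotating))
```

With these assumed splits, the fixed prefix sends all 3000 writes to one
region, while the rotating prefix reaches all four. The spread is only
as even as the split points allow, so how well this works on a real
cluster depends on where the region boundaries actually fall.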

JM

2012/9/3, Eric Czech <[EMAIL PROTECTED]>:
> Hi everyone,
>
> I was curious whether or not I should expect any write hot spots if I
> structured my composite keys in a way such that the first field is a
> low cardinality (maybe 30 distinct values) value and the next field
> contains a very high cardinality value that would not be written
> sequentially.
>
> More concisely, I want to do this:
>
> Given one number between 1 and 30, write many millions of rows with
> keys like <number chosen> : <some generally distinct, non-sequential
> value>
>
> Would there be any problem with the millions of writes happening with
> the same first field key prefix even if the second field is largely
> unique?
>
> Thank you!
>