HBase >> mail # user >> question about pre-splitting regions


question about pre-splitting regions
Hi,

I am creating a new table and want to pre-split its regions, but I am
seeing some weird behavior.

The rowkey of my table is designed as a composite of multiple fixed-length
byte arrays separated by a control character (for simplicity's sake, say the
separator is an underscore). The prefix of this rowkey is deterministic
(it is always 8 bytes long), and I know beforehand how many different
prefixes I will see in the near future. The values after the prefix are not
deterministic. I wanted to create a pre-split table based on the number of
prefix combinations that I know.

I ended up doing something like this:
hbaseAdmin.createTable(tableName, Bytes.toBytes(1L),
Bytes.toBytes(maxCombinationPrefixValue), maxCombinationPrefixValue)

The createTable call worked fine and, as expected, created that number of
regions. But when I write data to the table, I still see all the writes
hitting a single region instead of hitting different regions based on the
prefix. Is my idea of splitting by prefix values flawed? Do I have to
split by real rowkeys? (It is impossible for me to know which rowkeys will
show up; only the row prefix is deterministic.)

I suspect I have a flawed understanding of the createTable API and that
this is causing the issue. Should I use the createTable method that takes
a byte[][] (built from my prefixes) instead of the one I am using right now?
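
For illustration, here is a minimal sketch of how split keys relate to
rowkeys, written in plain Java with no HBase dependency. It relies on two
facts: HBase orders rowkeys by unsigned lexicographic byte comparison, and
Bytes.toBytes(long) yields 8 big-endian bytes. Everything else is an
assumption made for the example: the class PrefixSplitSketch, its helpers,
the ASCII prefixes such as 00000001, and the long-encoded split keys 2, 3, 4
(the keys createTable actually derives from a start/end range may differ).
Under those assumptions, rowkeys whose prefix is written as text all sort
after the long-encoded split keys and land in the last region, while split
keys built from the same bytes as the prefixes spread them out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; not part of HBase.
class PrefixSplitSketch {

    // Unsigned lexicographic comparison, the order HBase uses for rowkeys.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // N known prefixes -> N-1 split keys (the first region has no start key).
    static byte[][] splitKeys(byte[][] prefixes) {
        byte[][] sorted = prefixes.clone();
        Arrays.sort(sorted, PrefixSplitSketch::compareUnsigned);
        return Arrays.copyOfRange(sorted, 1, sorted.length);
    }

    // Index of the region a rowkey falls into, given sorted split keys:
    // the number of split keys that are <= the rowkey.
    static int regionIndex(byte[][] splits, byte[] rowKey) {
        int idx = 0;
        for (byte[] s : splits) {
            if (compareUnsigned(rowKey, s) >= 0) {
                idx++;
            }
        }
        return idx;
    }

    // Same byte layout as Bytes.toBytes(long): 8 bytes, big-endian.
    static byte[] longKey(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Assumed rowkeys: 8-character ASCII prefix, underscore, then a suffix.
        byte[] rowA = ascii("00000001_x");
        byte[] rowB = ascii("00000003_y");

        // Assumed split keys encoded as longs (hypothetical values).
        byte[][] longSplits = { longKey(2), longKey(3), longKey(4) };
        // Both rowkeys start with byte 0x30 ('0'), which sorts after 0x00,
        // so both fall into the same (last) region.
        System.out.println(regionIndex(longSplits, rowA));
        System.out.println(regionIndex(longSplits, rowB));

        // Split keys built from the same bytes the rowkeys start with.
        byte[][] prefixSplits = splitKeys(new byte[][] {
            ascii("00000001"), ascii("00000002"),
            ascii("00000003"), ascii("00000004")
        });
        // Now the two rowkeys fall into different regions.
        System.out.println(regionIndex(prefixSplits, rowA));
        System.out.println(regionIndex(prefixSplits, rowB));
    }
}
```

With real HBase, the second set of keys is the kind of value that would be
passed as the byte[][] argument to createTable. Whether this explains the
behavior described here depends on how the prefix is actually encoded in
the rowkey.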

Any suggestions/pointers?

Thanks,
Viral
Viral Bajaria 2013-02-15, 04:08