HBase, mail # user - Pre splitting the HBase Table for specific row key design


Re: Pre splitting the HBase Table for specific row key design
Jean-Marc Spaggiari 2013-12-27, 16:30
Hi Hari,

Can you please provide more details on the challenge that you are
facing?

You can pre-split using the Java client API, the HBase shell, or even
the web UI.

For the shell, you can do something like this:

  create 'transactions', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
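
For the Java client route, here is a minimal sketch of how explicit split points for a bucketed key could be computed. It assumes the bucket number is stored as a single leading byte of the row key (the original message does not say how wide the bucket prefix is), and the class and method names (BucketSplitKeys, splitKeys) are made up for illustration. It uses only the JDK, so it runs without HBase on the classpath.

```java
import java.util.Arrays;

public class BucketSplitKeys {

    // Builds numBuckets - 1 split points, one at the start of every bucket
    // except the first. ASSUMPTION: the bucket number is the first byte of
    // the row key and occupies exactly one byte.
    static byte[][] splitKeys(int numBuckets) {
        if (numBuckets < 2 || numBuckets > 256) {
            throw new IllegalArgumentException("numBuckets must be between 2 and 256");
        }
        byte[][] keys = new byte[numBuckets - 1][];
        for (int bucket = 1; bucket < numBuckets; bucket++) {
            keys[bucket - 1] = new byte[] { (byte) bucket };
        }
        return keys;
    }

    public static void main(String[] args) {
        // 16 buckets (0 to 15) give 15 split points, hence 16 regions.
        byte[][] keys = splitKeys(16);
        for (byte[] key : keys) {
            System.out.println(Arrays.toString(key));
        }
        // With the HBase client available, this array could be passed to
        // HBaseAdmin.createTable(tableDescriptor, keys).
    }
}
```

With 16 buckets this yields 15 split points and therefore 16 regions, one per bucket. Note that HexStringSplit is meant for row keys that are hex strings, so explicit split points like these may fit a binary bucket prefix better; if the bucket is encoded with a different width, the split keys would need the same width.
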
JM

2013/12/27 Hari Krishna <[EMAIL PROTECTED]>

> Hi,
>
> We are planning to migrate from a CDH3 cluster to a CDH4 cluster, and as
> part of the migration we also plan to use HBase instead of the Hive
> warehouse that we use on the CDH3 cluster. Every day we bring data from
> Oracle into Hadoop using Sqoop, from 10 different database schemas.
>
> In the Hive warehouse we maintain a table with the schema name as the
> top-level partition and the date as a sub-partition inside each schema
> partition. Each day's data for the table is stored in that day's date
> partition.
>
> In HBase we have designed a table whose row key is a combination of (byte
> array value of the bucket number (values range from 0 to 15, so we maintain
> 16 buckets in total), MD5(schema), MD5(date), byte array value of
> pkid). It is working as expected: we are able to retrieve the data by
> schema and date, which is our key use case. Within each bucket, the key
> ranges from 0 to the maximum long value.
>
> Now we are having a challenge pre-splitting the table (let's say the table
> is named transactions). Can anyone help me with this?
>
> Regards,
> GHK.
>