HBase, mail # user - Pre-split table using shell


Re: Pre-split table using shell
Michael Segel 2012-06-12, 08:23
UUIDs are unique but not necessarily random, and even with random sampling you may not see an even distribution except over time.
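
To illustrate the point about small samples, here is a minimal sketch that buckets random UUIDs by their first key byte. It assumes 32 equal-width regions over the first byte (0x00 to 0xFF); the class and method names are hypothetical and not taken from the thread.

```java
import java.util.UUID;

public class UuidBucketCheck {
    // Region index for a key whose first byte is the top byte of the UUID's
    // most significant bits. Assumes `regions` equal-width ranges over 0x00..0xFF.
    public static int bucketOf(UUID uuid, int regions) {
        int firstByte = (int) (uuid.getMostSignificantBits() >>> 56) & 0xFF;
        return firstByte * regions / 256;
    }

    // Generate `samples` random UUIDs and count how many land in each region.
    public static int[] countBuckets(int samples, int regions) {
        int[] counts = new int[regions];
        for (int i = 0; i < samples; i++) {
            counts[bucketOf(UUID.randomUUID(), regions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample size is arbitrary; try smaller and larger values to compare.
        int[] counts = countBuckets(1000, 32);
        for (int i = 0; i < counts.length; i++) {
            System.out.printf("region %2d: %d rows%n", i, counts[i]);
        }
    }
}
```

Running this with a small sample size and then a large one shows how much the per-region counts vary before they even out.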

On Jun 12, 2012, at 3:18 AM, "Simon Kelly" <[EMAIL PROTECTED]> wrote:

> Hi
>
> I'm getting some unexpected results with a pre-split table where some of
> the regions are not getting any data.
>
> The table keys are UUIDs (generated using Java's UUID.randomUUID()), which
> I'm storing as a byte[16]:
>
>    key[0-7] = uuid most significant bits
>    key[8-15] = uuid least significant bits
>
> The table is created via the shell as follows:
>
>    create 'table', {NAME => 'cf'}, {SPLITS_FILE => 'splits.txt'}
>
> The splits.txt is generated using the code here:
> http://pastebin.com/DAExXMDz which generates 32 regions split between x00
> and xFF. I have also tried with 16-byte region keys (x00x00... to
> xFFxFF...).
>
> As far as I understand, this should distribute the rows evenly across the
> regions, but I'm getting a bunch of regions with no rows. I'm also confused
> as to the ordering of the regions, since the start and end keys don't seem
> to match up correctly. You can see the regions and the requests they are
> getting here: http://pastebin.com/B4771g5X
>
> Thanks in advance for the help.
> Simon
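
For reference, the key layout and split points described in the quoted message could be sketched as follows. This is not the pastebin code, which is not reproduced here; it only assumes a 16-byte big-endian key and single-byte split points at equal intervals. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class UuidKeys {
    // key[0-7] = most significant bits, key[8-15] = least significant bits.
    // ByteBuffer writes longs big-endian by default, matching that layout.
    public static byte[] toKey(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Returns regions - 1 single-byte split points spaced evenly over 0x00..0xFF.
    // For 32 regions these are 0x08, 0x10, ..., 0xF8.
    public static byte[][] splitPoints(int regions) {
        byte[][] splits = new byte[regions - 1][];
        for (int i = 1; i < regions; i++) {
            // The cast wraps values above 0x7F to negative Java bytes;
            // the underlying bit pattern is unchanged.
            splits[i - 1] = new byte[] { (byte) (i * 256 / regions) };
        }
        return splits;
    }
}
```

One thing to keep in mind when checking region order: HBase compares row keys as unsigned bytes, while Java's byte type is signed, so any code that sorts or prints split keys as signed values will show 0x80 to 0xFF ahead of 0x00 to 0x7F.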