1. When I don't supply SPLITS at table creation, all put operations go to a single region. When that region grows beyond hbase.hregion.max.filesize, it is split into two regions. Does each new region get half of the data, or is one of them initially empty?
2. If each region gets 50% of the data and the row key is monotonically increasing, will one region stay half full forever and never receive new writes?
3. When pre-splitting a table, is specifying row boundaries or key prefixes the only way? If I don't know the key ranges (in my case the key is a GUID, a 32-character hexadecimal string), what should the region split boundaries be? And how many splits should be created: should it equal the number of region servers (aka DataNodes)?
4. For keys of the form ACTIVITYTYPE-DATE (where the activity type has two values, login and logout), what should the split strategy be?
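To make question 3 concrete, here is a rough sketch of what I have in mind for the GUID case: if the keys are roughly uniformly distributed over the hexadecimal key space, the split boundaries could simply be evenly spaced points in that space. The function name `hex_split_points` and its parameters are hypothetical, and the sketch assumes lowercase, fixed-width hex keys; it only computes the boundary strings and does not call any HBase API.

```python
def hex_split_points(num_regions, key_length=32):
    """Return num_regions - 1 evenly spaced split keys for fixed-width hex row keys.

    Assumes keys are lowercase hex strings of exactly `key_length` characters
    and are roughly uniformly distributed (as random GUIDs would be).
    """
    if num_regions < 2:
        # A single region needs no split points.
        return []
    key_space = 16 ** key_length  # total number of possible keys
    points = []
    for i in range(1, num_regions):
        boundary = key_space * i // num_regions
        # Zero-pad so every boundary has the same width as a real key.
        points.append(format(boundary, "0{}x".format(key_length)))
    return points


if __name__ == "__main__":
    for point in hex_split_points(4):
        print(point)
```

The resulting strings would then be passed as the SPLITS list at table creation. Whether the number of regions should match the number of region servers is exactly what I am unsure about, so `num_regions` is left as a parameter.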
On Tue, Jul 15, 2014 at 7:03 PM, Ted Yu <[EMAIL PROTECTED]> wrote: