HBase >> mail # user >> Region Splits


No. The purpose of major compactions is to merge and dedupe data within a
region's boundaries. Compactions do not alter region boundaries. The one
related case is a split, where a compaction is needed to filter out any rows
from the parent region that no longer belong to the daughter region.
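
As a rough illustration of the split pattern discussed in this thread, here is
a toy Python sketch. It is not HBase code: the function name
simulate_sequential_writes and its parameters are made up for the example, and
it assumes a split is triggered by a row count and happens at the midpoint key.
Real HBase triggers splits on store size and picks its own split point, so
treat this only as a model of the behaviour described below.

```python
def simulate_sequential_writes(total_rows, split_size):
    """Toy model: write keys 0..total_rows-1 in order and split on row count.

    Each region is a list [start_key, end_key, row_count]. end_key is
    exclusive, and None marks the open-ended last region.
    """
    regions = [[0, None, 0]]
    for _key in range(total_rows):
        # A sequential key is always larger than every existing key,
        # so it always lands in the last (open-ended) region.
        last = regions[-1]
        last[2] += 1
        if last[2] >= split_size:
            start, _, count = last
            mid = start + count // 2  # assumed split point: halfway through the keys
            # The lower daughter is closed off and will never be written again.
            regions[-1] = [start, mid, mid - start]
            # The upper daughter becomes the new open-ended region.
            regions.append([mid, None, count - (mid - start)])
    return regions


if __name__ == "__main__":
    for start, end, count in simulate_sequential_writes(300, 100):
        print(f"[{start}, {'END' if end is None else end}) rows={count}")
```

In this model, with a split size of 100 and 300 sequential keys, every closed
region holds 50 rows and only the last region keeps receiving writes, which
matches the MaxRegionSize/2 observation in the original question.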

On 11/22/11 9:04 AM, "Srikanth P. Shreenivas"
<[EMAIL PROTECTED]> wrote:

>Will major compactions take care of merging "older" regions or adding
>more key/values to them as the number of regions grows?
>
>Regards,
>Srikanth
>
>-----Original Message-----
>From: Amandeep Khurana [mailto:[EMAIL PROTECTED]]
>Sent: Monday, November 21, 2011 7:25 AM
>To: [EMAIL PROTECTED]
>Subject: Re: Region Splits
>
>Mark,
>
>Yes, your understanding is correct. If your keys are sequential (timestamps,
>etc.), you will always be writing to the end of the table, and "older"
>regions will not get any writes. This is one of the arguments against using
>sequential keys.
>
>-ak
>
>On Sun, Nov 20, 2011 at 11:33 AM, Mark <[EMAIL PROTECTED]> wrote:
>
>> Say we have a use case that has sequential row keys and we have rows
>> 0-100. Let's assume that 100 rows = the split size. Now when there is a
>> split it will split at the halfway mark so there will be two regions as
>> follows:
>>
>> Region1 [START-49]
>> Region2 [50-END]
>>
>> So now at this point all inserts will be writing to Region2 only, correct?
>> Now at some point Region2 will need to split, and it will look like the
>> following before the split:
>>
>> Region1 [START-49]
>> Region2 [50-150]
>>
>> After the split it will look like:
>>
>> Region1 [START-49]
>> Region2 [50-99]
>> Region3 [100-END]
>>
>> And this pattern will continue, correct? My question is: when a use case
>> has sequential keys, how would any of the older regions ever receive any
>> more writes? It seems like they would always be stuck at
>> MaxRegionSize/2. Can someone please confirm or clarify this issue?
>>
>> Thanks
>>
>>
>>
>>
>>
>