HBase, mail # user - Best way to compact a region after a move?


Re: Best way to compact a region after a move?
Jean-Marc Spaggiari 2012-12-30, 17:47
Hi Ted,

Thanks for your reply.

I looked at RegionObserver and I will dig in that direction. I think I
found what I need in it.

How can I attach it to HBase? Should I do that on all the servers, or
only on the master, which would then replicate it? Should I attach it
to each region, or directly to the table?

Thanks,

JM

2012/12/30, Ted Yu <[EMAIL PROTECTED]>:
> balanceCluster() executes on the master. Compaction is a region server
> activity, so they don't pair naturally.
>
> I answered the first part of the question in the thread titled 'How to
> know it's time for a major compaction?':
>
> In RegionObserver, we already have the following hook:
>
>   /**
>    * Called after the region is reported as open to the master.
>    * @param c the environment provided by the region server
>    */
>   void postOpen(final ObserverContext<RegionCoprocessorEnvironment> c);
>
> Auto-compaction logic can be triggered through the above hook.
>
> Take a look at the following hook for the second part of your question:
>
>   void postCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
>       final HStore store, StoreFile resultFile) throws IOException;
>
> Cheers
>
> On Sun, Dec 30, 2012 at 8:25 AM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> When I manually balance the regions on my cluster, I want to make
>> sure they are local, so I want to major_compact them each time I
>> move them.
>>
>> The balanceCluster method returns a list of regions to move, which
>> means they have not been moved yet, so I can't compact them
>> there.
>>
>> Is there a place where I should hook in to compact those regions?
>>
>> So far, the only idea I have found is to start a thread in
>> balanceCluster, wait 1 minute, and compact all the regions I
>> returned. But I'm wondering if there is a better way to achieve that.
>> Is there a queue where I should place those regions to compact
>> instead? Also, I need to know (even if it's just in the logs) when
>> those compactions are done.
>>
>> Thanks,
>>
>> JM
>>
>
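
The approach discussed in this thread (request a major compaction when a region opens after a move, then note when the compaction finishes) can be sketched as below. This is only a minimal, self-contained illustration of the control flow: RegionHooks, CompactionRequester, CompactOnOpenObserver and CompactOnOpenDemo are hypothetical stand-ins invented for this sketch, not HBase classes, and the region and file names are made up. A real coprocessor would implement the RegionObserver hooks quoted above and use the HBase types from their signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two region-level hooks quoted in the thread.
// This is NOT the HBase RegionObserver interface.
interface RegionHooks {
    void postOpen(String regionName);
    void postCompact(String regionName, String resultFile);
}

// Hypothetical stand-in for whatever component queues a major compaction.
interface CompactionRequester {
    void requestMajorCompaction(String regionName);
}

// Requests a major compaction whenever a region opens (for example after a
// move) and records a log line when a compaction finishes.
class CompactOnOpenObserver implements RegionHooks {
    private final CompactionRequester requester;
    private final List<String> log = new ArrayList<>();

    CompactOnOpenObserver(CompactionRequester requester) {
        this.requester = requester;
    }

    @Override
    public void postOpen(String regionName) {
        log.add("opened " + regionName + ", requesting major compaction");
        requester.requestMajorCompaction(regionName);
    }

    @Override
    public void postCompact(String regionName, String resultFile) {
        log.add("compaction done for " + regionName + " -> " + resultFile);
    }

    List<String> log() {
        return log;
    }
}

class CompactOnOpenDemo {
    public static void main(String[] args) {
        // The "compaction queue" here is just an in-memory list of region names.
        List<String> queue = new ArrayList<>();
        CompactOnOpenObserver observer = new CompactOnOpenObserver(queue::add);

        // Simulate a region being opened on its new server after a move.
        observer.postOpen("region-A");

        // Simulate the server finishing each queued compaction.
        for (String region : queue) {
            observer.postCompact(region, region + "/storefile-1");
        }
        observer.log().forEach(System.out::println);
    }
}
```

The point of the design is that the balancer only decides which regions move; the reaction to a region opening lives with the region itself, so no timer thread is needed. On the question of attaching an observer: as far as I know, HBase loads region coprocessors either through the hbase.coprocessor.region.classes property in hbase-site.xml on the region servers or through a per-table attribute, but check the documentation for your version.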