HBase, mail # user - Best way to compact a region after a move?


Re: Best way to compact a region after a move?
Jean-Marc Spaggiari 2012-12-30, 18:38
Exactly what I was looking for ;)

Thanks a lot!

JM

2012/12/30, Ted Yu <[EMAIL PROTECTED]>:
> I guess you would want custom compaction only on user tables.
> Take a look at the following config param in
> http://hbase.apache.org/book.html:
> hbase.coprocessor.region.classes
>
> Cheers
>
> On Sun, Dec 30, 2012 at 10:25 AM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]> wrote:
>
>> Thanks for the hints. I will look there too.
>>
>> Is there a way to attach it to ALL the tables and not specifically to
>> some tables? Or should I attach it to the tables one by one?
>>
>> 2012/12/30, Ted Yu <[EMAIL PROTECTED]>:
>> > You can find how to dynamically load coprocessor in
>> > hbase-server/src/main/ruby/shell/commands/alter.rb
>> >
>> > There are ample test cases which show you how to use RegionObserver,
>> > e.g.
>> > src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java
>> >
>> > Yes, you can attach your coprocessor to the table(s) on which you want
>> > custom compaction. The coprocessor would be deployed on region servers.
>> >
>> > Cheers
>> >
>> > On Sun, Dec 30, 2012 at 9:47 AM, Jean-Marc Spaggiari <
>> > [EMAIL PROTECTED]> wrote:
>> >
>> >> Hi Ted,
>> >>
>> >> Thanks for your reply.
>> >>
>> >> I looked at the RegionObserver and I will dig this way.  I think I
>> >> found what I need in it.
>> >>
>> >> How can I attach it to HBase? Should I do that on all the servers? On
>> >> the master only, and it will replicate? Should I attach it to each
>> >> region? Or directly to the table?
>> >>
>> >> Thanks,
>> >>
>> >> JM
>> >>
>> >>
>> >>
>> >> 2012/12/30, Ted Yu <[EMAIL PROTECTED]>:
>> >> > balanceCluster() executes on the master. Compaction is a region
>> >> > server activity, so they don't pair naturally.
>> >> >
>> >> > I answered the first part of the question in the thread titled
>> >> > 'How to know it's time for a major compaction?':
>> >> >
>> >> > In RegionObserver, we already have the following hook:
>> >> >
>> >> >   /**
>> >> >    * Called after the region is reported as open to the master.
>> >> >    * @param c the environment provided by the region server
>> >> >    */
>> >> >   void postOpen(final ObserverContext<RegionCoprocessorEnvironment> c);
>> >> >
>> >> > Auto-compaction logic can be triggered through the above hook.
>> >> >
>> >> > Take a look at the following hook for the second part of your
>> >> > question:
>> >> >
>> >> >   void postCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
>> >> >       final HStore store, StoreFile resultFile) throws IOException;
>> >> >
>> >> > Cheers
>> >> >
>> >> > On Sun, Dec 30, 2012 at 8:25 AM, Jean-Marc Spaggiari <
>> >> > [EMAIL PROTECTED]> wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> When I balance the regions on my cluster manually, I want to make
>> >> >> sure they are local, so I want to major_compact them each time I
>> >> >> move them.
>> >> >>
>> >> >> The balanceCluster method returns a list of regions to move, which
>> >> >> means they have not been moved yet, so I can't compact them there.
>> >> >>
>> >> >> Is there a place where I should hook in to compact those regions?
>> >> >>
>> >> >> So far, the only idea I have found is to start a thread in
>> >> >> balanceCluster, wait 1 minute, and compact all the regions I
>> >> >> returned. But I'm wondering if there is a better way to achieve
>> >> >> that? Is there a queue where I should place those regions to
>> >> >> compact instead? Also, I need to know (even if it's just in the
>> >> >> logs) when those compactions are done.
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> JM
>> >> >>
>> >> >
>> >>
>> >
>>
>
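
For readers who land on this thread: the approach Ted describes (request a compaction from postOpen, learn about completion from postCompact) can be sketched roughly as below. This is only an illustration of the control flow, not HBase code. RegionHandle, RegionHooks, requestMajorCompaction() and FakeRegion are hypothetical stand-ins invented here so the example compiles without HBase on the classpath; a real coprocessor would implement RegionObserver and use whatever compaction call your HBase version provides. The check for -ROOT- and .META. reflects the point above that custom compaction is probably only wanted on user tables.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the region handed to a hook. NOT an HBase type.
interface RegionHandle {
    String tableName();
    String regionName();
    void requestMajorCompaction(); // assumed method name, illustration only
}

// Hypothetical stand-in mirroring the two RegionObserver hooks quoted above.
interface RegionHooks {
    void postOpen(RegionHandle region);
    void postCompact(RegionHandle region, String resultFile);
}

// Requests a major compaction each time a user-table region opens (which
// includes the open that follows a move) and records when one completes.
final class CompactOnOpenHooks implements RegionHooks {
    private final List<String> events = new ArrayList<>();

    @Override
    public void postOpen(RegionHandle region) {
        if (isCatalogTable(region.tableName())) {
            return; // leave the catalog tables alone
        }
        region.requestMajorCompaction();
        events.add("requested " + region.regionName());
    }

    @Override
    public void postCompact(RegionHandle region, String resultFile) {
        events.add("completed " + region.regionName() + " -> " + resultFile);
    }

    List<String> events() {
        return events;
    }

    private static boolean isCatalogTable(String tableName) {
        return tableName.equals("-ROOT-") || tableName.equals(".META.");
    }
}

// In-memory fake used only to exercise the hooks without a cluster.
final class FakeRegion implements RegionHandle {
    private final String tableName;
    private final String regionName;
    int compactionRequests = 0;

    FakeRegion(String tableName, String regionName) {
        this.tableName = tableName;
        this.regionName = regionName;
    }

    @Override
    public String tableName() {
        return tableName;
    }

    @Override
    public String regionName() {
        return regionName;
    }

    @Override
    public void requestMajorCompaction() {
        compactionRequests++;
    }
}
```

Note that a hook on region open fires for every open, not only for moves, so a region server restart would also trigger compaction requests with this design. Whether that is acceptable depends on the cluster.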