HBase, mail # user - difference between major and minor compactions?


yun peng 2013-06-22, 12:05
Jean-Marc Spaggiari 2013-06-22, 12:51
yun peng 2013-06-22, 13:12
Re: difference between major and minor compactions?
Jean-Marc Spaggiari 2013-06-22, 13:24
Hi Yun,

There are more differences.

Minor compactions do not remove the delete markers or the deleted
cells. They only merge small files into a bigger one. Only a major
compaction (in 0.94) deals with the deleted cells. There are also
some more compaction mechanisms coming in trunk with nice features.

Look at: https://issues.apache.org/jira/browse/HBASE-7902
https://issues.apache.org/jira/browse/HBASE-7680
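
To illustrate the delete handling described above, here is a minimal
sketch in Python. It is a toy model, not HBase code: the Cell class, the
compact() function and the rule "a delete marker masks every cell in the
same row with an equal or older timestamp" are simplifying assumptions
made for this example (real HBase has several delete marker types).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical cell: value=None stands for a delete marker."""
    row: str
    ts: int
    value: Optional[str] = None

    @property
    def is_delete(self) -> bool:
        return self.value is None


def compact(files: List[List[Cell]], major: bool) -> List[Cell]:
    """Merge several store files into one sorted file.

    A minor compaction (major=False) only merges and keeps everything.
    A major compaction (major=True) also drops the delete markers and
    the cells they mask.
    """
    merged = sorted(
        (cell for f in files for cell in f),
        key=lambda c: (c.row, -c.ts),
    )
    if not major:
        return merged

    # Newest delete timestamp seen for each row.
    newest_delete = {}
    for cell in merged:
        if cell.is_delete:
            newest_delete[cell.row] = max(newest_delete.get(cell.row, cell.ts), cell.ts)

    # Keep only live cells that are newer than any delete marker on their row.
    return [
        c for c in merged
        if not c.is_delete and c.ts > newest_delete.get(c.row, -1)
    ]


files = [
    [Cell("r1", 1, "a"), Cell("r2", 1, "b")],
    [Cell("r1", 2)],            # delete marker for r1
    [Cell("r1", 3, "c")],
]
minor_result = compact(files[:2], major=False)
major_result = compact(files, major=True)
```

In this sketch the minor compaction of the first two files still carries
the delete marker and the masked cell, while the major compaction over
all three files returns only the live cells.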

Minor compactions are promoted to major compactions when the
compaction policy decides to compact all the files. If all the files
need to be merged anyway, then HBase can run a major compaction, which
does the same thing as the minor one, but with the bonus of deleting
the cells marked for deletion.
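
The promotion rule can be sketched like this in Python. This is only an
illustration under stated assumptions: select_files() is a hypothetical
size-ratio heuristic invented for the example (with made-up defaults
ratio=1.2 and min_files=3), not the actual HBase selection code. The
only point it shows is the last step: if the selection covers every
store file, the compaction is treated as a major one.

```python
from typing import List, Tuple


def select_files(sizes: List[int], ratio: float = 1.2, min_files: int = 3) -> List[int]:
    """Hypothetical selection policy (an assumption, not HBase's code).

    Files are ordered oldest to newest. An old file is skipped while it is
    larger than ratio * (total size of all newer files). Returns the indexes
    of the selected files, or an empty list if too few files qualify.
    """
    start = 0
    while start < len(sizes) and sizes[start] > ratio * sum(sizes[start + 1:]):
        start += 1
    selected = list(range(start, len(sizes)))
    return selected if len(selected) >= min_files else []


def plan_compaction(sizes: List[int]) -> Tuple[str, List[int]]:
    """Return ("none" | "minor" | "major", selected file indexes)."""
    selected = select_files(sizes)
    if not selected:
        return "none", selected
    # Promotion: selecting every store file turns the minor into a major.
    kind = "major" if len(selected) == len(sizes) else "minor"
    return kind, selected


plan_big_old_file = plan_compaction([1000, 10, 12, 11])
plan_similar_sizes = plan_compaction([20, 10, 12, 11])
plan_too_few = plan_compaction([1000, 10])
```

With one large old file, the toy policy leaves that file out and the
result stays a minor compaction. When all files are of similar size,
every file is selected and the compaction is promoted to major.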

JM

2013/6/22 yun peng <[EMAIL PROTECTED]>:
> Thanks, JM
> It seems like the sole difference between major and minor compaction is the
> number of files (all storefiles or just a subset). It is mentioned
> very briefly in
> http://hbase.apache.org/book/regions.arch.html that
> "Sometimes a minor compaction will ... promote itself to being a major
> compaction". What exactly does "sometimes" mean here? Or is there any policy
> in HBase that allows an application to specify when to promote a minor
> compaction to a major one (e.g. a user or some monitoring service can
> specify that now is an off-peak time)?
> Yun
>
>
>
> On Sat, Jun 22, 2013 at 8:51 AM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]> wrote:
>
>> Hi Yun,
>>
>> A few links:
>> - http://blog.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
>> => There is a small paragraph about compactions which explains when
>> they are triggered.
>> - http://hbase.apache.org/book/regions.arch.html 9.7.6.5
>>
>> You are almost right. The only thing is that HBase doesn't know when
>> your off-peak time is, so a major compaction can be triggered at any
>> time if the minor one is promoted to a major one.
>>
>> JM
>>
>> 2013/6/22 yun peng <[EMAIL PROTECTED]>:
>> > Hi, All
>> >
>> > I am asking about the different practices of major and minor compaction... My
>> > current understanding is that minor compaction, triggered automatically,
>> > usually runs alongside online query serving (but in the background), so
>> > it is important to make it as lightweight as possible... to minimise
>> > downtime (pause time) of online queries.
>> >
>> > In contrast, major compaction is invoked at off-peak times and can usually
>> > be assumed to have resources exclusively. It may have a different
>> > performance optimization goal...
>> >
>> > Correct me if I am wrong, and let me know whether HBase does design
>> > different compaction mechanisms this way.
>> >
>> > Regards,
>> > Yun
>>
yun peng 2013-06-22, 14:35
Suraj Varma 2013-06-22, 18:51
Vladimir Rodionov 2013-06-22, 23:23