HBase, mail # user - difference between major and minor compactions?


Re: difference between major and minor compactions?
yun peng 2013-06-22, 14:35
I am more concerned with which CompactionPolicy is available that lets the
application influence how compaction proceeds. It looks like there is a
new API in the 0.97 version,
*ExploringCompactionPolicy*<http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/compactions/ExploringCompactionPolicy.html>,
which lets the application decide when a major compaction should happen.

Stripe compaction looks very interesting; I will look into it. Thanks.
Yun
On Sat, Jun 22, 2013 at 9:24 AM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi Yun,
>
> There are more differences.
>
> Minor compactions do not remove the delete markers or the deleted
> cells. They only merge the small files into a bigger one. Only a major
> compaction (in 0.94) deals with the deleted cells. There are also
> some more compaction mechanisms coming in trunk with nice features.
>
> Look at: https://issues.apache.org/jira/browse/HBASE-7902
> https://issues.apache.org/jira/browse/HBASE-7680
>
> A minor compaction is promoted to a major compaction when the
> compaction policy decides to compact all the files. If all the files
> need to be merged, then we can run a major compaction, which does
> the same thing as the minor one, but with the bonus of removing the
> cells marked for deletion.
>
> JM
>
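The promotion rule JM describes above can be illustrated with a small sketch. This is a toy model, not HBase's implementation: the names Cell, compact and drop_deleted are hypothetical, a store file is modelled as a plain list of cells, and a cell whose value is None stands in for a delete marker. The only point it shows is that a compaction which selects every store file is treated as major and may therefore drop delete markers and the cells they cover, while a compaction over a subset must keep them.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Cell:
    row: str
    ts: int
    value: Optional[str]  # None stands in for a delete marker (assumption)


def drop_deleted(cells: List[Cell]) -> List[Cell]:
    """Remove delete markers and every cell at or below a marker's timestamp.

    Expects cells sorted by row, newest first, markers before puts on ties.
    """
    kept = []
    deleted_at = {}  # row -> timestamp of the newest delete marker seen
    for cell in cells:
        if cell.value is None:
            deleted_at[cell.row] = max(deleted_at.get(cell.row, cell.ts), cell.ts)
        elif cell.ts > deleted_at.get(cell.row, -1):
            kept.append(cell)
    return kept


def compact(store_files: List[List[Cell]],
            selected: Sequence[int]) -> Tuple[List[List[Cell]], bool]:
    """Merge the selected store files; return (new store files, was_major)."""
    chosen = set(selected)
    # Selecting every file is what "promotes" the compaction to major.
    is_major = len(chosen) == len(store_files)
    merged = sorted(
        (cell for i in chosen for cell in store_files[i]),
        key=lambda c: (c.row, -c.ts, c.value is not None),
    )
    if is_major:
        # Only safe when no older file is left out of the merge.
        merged = drop_deleted(merged)
    untouched = [f for i, f in enumerate(store_files) if i not in chosen]
    return untouched + [merged], is_major
```

With three store files where the second holds a delete marker for row "a", compacting files 0 and 1 leaves the marker in the merged file, because file 2 was not read and could still contain cells the marker must hide. Compacting all three removes both the marker and the deleted cell.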
> 2013/6/22 yun peng <[EMAIL PROTECTED]>:
> > Thanks, JM
> > It seems like the sole difference between major and minor compaction is the
> > number of files involved (all of the storefiles or just a subset). It is
> > mentioned very briefly in
> > http://hbase.apache.org/book<http://hbase.apache.org/book/regions.arch.html> that
> > "Sometimes a minor compaction will ... promote itself to being a major
> > compaction". What exactly does "sometimes" mean here? Is there any policy in
> > HBase that allows the application to specify when to promote a minor
> > compaction to a major one (for example, so that a user or some monitoring
> > service can say that now is off-peak time)?
> > Yun
> >
> >
> >
> > On Sat, Jun 22, 2013 at 8:51 AM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> >> Hi Yun,
> >>
> >> A few links:
> >> - http://blog.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
> >> => There is a small paragraph about compactions which explains when
> >> they are triggered.
> >> - http://hbase.apache.org/book/regions.arch.html 9.7.6.5
> >>
> >> You are almost right. The only thing is that HBase doesn't know when
> >> your off-peak time is, so a major compaction can be triggered at any
> >> time if a minor one is promoted to a major one.
> >>
> >> JM
> >>
> >> 2013/6/22 yun peng <[EMAIL PROTECTED]>:
> >> > Hi, All
> >> >
> >> > I am asking about the different practices for major and minor
> >> > compaction. My current understanding is that minor compaction, triggered
> >> > automatically, usually runs alongside online query serving (but in the
> >> > background), so it is important to make it as lightweight as possible to
> >> > minimise the downtime (pause time) of online queries.
> >> >
> >> > In contrast, major compaction is invoked at off-peak time and can usually
> >> > be assumed to have resources exclusively. It may have a different
> >> > performance optimization goal.
> >> >
> >> > Correct me if I am wrong, and let me know whether HBase does design its
> >> > compaction mechanisms differently in this way.
> >> >
> >> > Regards,
> >> > Yun
> >>
>