kzurek 2013-02-04, 12:45
Kevin Odell 2013-02-04, 14:46
Thanks for the reply, although I should clear up some misunderstandings. I
do know the general behavior of minor and major compaction and the difference
between them, as well as when a minor compaction might be promoted to a major
one. I just wanted to verify the influence of compaction (mostly major) on our
cluster performance. So I created a test in which a single-threaded client
puts data into only one region (of 71 in total), using the built-in caching
mechanism (as described in the book "HBase: The Definitive Guide"), and I
triggered major compaction by hand (HBaseAdmin). However, after a few tests I
noticed that major compaction ('Large Compaction') was being triggered anyway
(cache flusher, recursive queue), so I stopped triggering it myself. That
brought me to the current situation: I'm putting data and after a while I get
timeouts on the client. In the meantime I see that the memstore is being
flushed, but the flush cannot create a new store file (because there are too
many of them) and is frequently blocked by the compaction process. I hope this
short description gives a closer look at the issue. In addition, here are some
answers to your questions:
1. How often are you flushing?
I'm not triggering flushing by hand, but I've noticed that data is being
flushed every 4s (275m) or 1m 30s-40s (1.5g).
2. How often are you force flushing from HLog rolls?
Default settings are: blocksize=64 MB, rollsize=60.8 MB, enabled=true,
optionallogflushinternal=1000ms. It seems that a roll happens every hour.
3. What size are your flushes?
It depends: from 275m up to 1.5g. I've set my memstore flush size to 256m and
the memstore block multiplier to 6. Should I increase the flush size?
4. What does your region count look like as that can affect your flush size?
Initial split is 37 regions on 6 RegionServers, but at the moment there are
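
As a rough cross-check of the figures in answers 2 and 3, the sketch below redoes the arithmetic. It assumes that updates are blocked once a region's memstore reaches the flush size times the block multiplier, and that the HLog rolls at a fixed fraction of the block size. The 0.95 fraction and the class and method names are assumptions made for illustration; they are not stated in the thread.

```java
import java.util.Locale;

// Back-of-the-envelope check of the sizes quoted in the answers above.
// ASSUMPTION: updates block at flushSize * blockMultiplier.
// ASSUMPTION: the HLog rolls at blockSize * 0.95 (factor not given in the thread).
class FlushMath {

    // Memstore size (in MB) at which updates to the region would be blocked.
    static long blockingMemstoreMb(long flushSizeMb, int blockMultiplier) {
        return flushSizeMb * blockMultiplier;
    }

    // HLog size (in MB) at which a roll would be requested.
    static double logRollSizeMb(double blockSizeMb, double rollMultiplier) {
        return blockSizeMb * rollMultiplier;
    }

    public static void main(String[] args) {
        long blockingMb = blockingMemstoreMb(256, 6);
        System.out.println(String.format(Locale.ROOT,
                "updates block at %d MB (%.1f GB)", blockingMb, blockingMb / 1024.0));
        System.out.println(String.format(Locale.ROOT,
                "HLog rolls at %.1f MB", logRollSizeMb(64, 0.95)));
    }
}
```

With the values quoted above, 256 MB times 6 is 1536 MB (1.5 GB), which matches the largest flushes reported, and 64 MB times 0.95 is 60.8 MB, which matches the reported rollsize. If the first assumption holds, the 1.5g flushes would suggest the memstore had already reached its blocking limit before it could be flushed.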
Kevin O'dell wrote
> Just because you turn off time-based major compactions, it does not mean
> that you have turned major compaction off. Compactions can still be
> promoted to majors. Also, the only real difference between a major and a
> minor compaction is that the major one processes deletes. You should really
> schedule at least daily major compactions. As for your blocking issue, there
> are a few things you would want to look at:
> How often are you flushing?
> How often are you force flushing from HLog rolls?
> What size are your flushes?
> What does your region count look like as that can affect your flush size?
> When I see HBase blocking constantly, it is usually a sign that you need to
> do some fine-grained tuning.
> On Mon, Feb 4, 2013 at 7:45 AM, kzurek wrote:
>> I'm facing some issues regarding major compaction. I've disabled major
>> compaction and I'm not triggering it manually, but when I load data into a
>> selected region, I see that the major compaction queue is growing and that
>> compaction is being triggered ('Large Compaction' in logs) quite often
>> (mainly due to cacheFlusher). Moreover, I've noticed that my client app gets
>> timeouts while putting data into the cluster (this happens when the memstore
>> flusher is trying to dump memory content to a store file, but cannot due to
>> too many store files). A drop in data rate, which in this case is to be
>> expected, is also noticeable. To me, it looks like the compaction process is
>> not fast enough compared to the incoming data rate (or maybe it is something
>> else?) and is blocking the update process.
>> Basic information:
>> HBase Version: 0.92.1, r1298924
>> Hadoop Version: 1.0.3, r1335192
>> 2013-02-01 15:43:14,627 DEBUG
>> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Large Compaction
>> storeName=data, fileCount=3, fileSize=478.3m (249.8m, 113.7m, 114.7m),
>> priority=-3, time=1051078047102762; Because:
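
To make the log excerpt above easier to read, here is a small sketch that pulls the numeric fields out of a 'Large Compaction' line. The class name and the regular expression are hypothetical helpers, not part of HBase. Reading the priority as "blocking store file limit minus current store file count" is an assumption about this HBase version; if it holds, priority=-3 would mean the store was three files over the limit when the compaction was requested.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: extracts the numeric fields from a
// "Large Compaction" debug line such as the one quoted above.
class CompactionLogLine {
    private static final Pattern FIELDS = Pattern.compile(
            "fileCount=(\\d+), fileSize=([0-9.]+)m.*?priority=(-?\\d+)");

    final int fileCount;      // number of files selected for this compaction
    final double fileSizeMb;  // total size of the selected files, in MB
    final int priority;       // compaction priority as logged

    private CompactionLogLine(int fileCount, double fileSizeMb, int priority) {
        this.fileCount = fileCount;
        this.fileSizeMb = fileSizeMb;
        this.priority = priority;
    }

    static CompactionLogLine parse(String line) {
        Matcher m = FIELDS.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not a compaction line: " + line);
        }
        return new CompactionLogLine(
                Integer.parseInt(m.group(1)),
                Double.parseDouble(m.group(2)),
                Integer.parseInt(m.group(3)));
    }

    // ASSUMPTION: priority = blocking store file limit - store file count,
    // so a negative value counts the files above the limit.
    int storeFilesOverLimit() {
        return Math.max(0, -priority);
    }

    public static void main(String[] args) {
        String line = "Large Compaction storeName=data, fileCount=3, "
                + "fileSize=478.3m (249.8m, 113.7m, 114.7m), priority=-3, "
                + "time=1051078047102762";
        CompactionLogLine c = parse(line);
        System.out.println("files selected: " + c.fileCount
                + ", store files over limit (assumed): " + c.storeFilesOverLimit());
    }
}
```

Collecting this value over a whole RegionServer log would show whether the store stays above the limit for long stretches, which is when flushes are delayed and clients start to time out.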
View this message in context: http://apache-hbase.679495.n3.nabble.com/MemStoreFlusher-region-has-too-many-store-files-client-timeout-tp4037887p4037949.html
Kevin Odell 2013-02-05, 14:20
kzurek 2013-02-05, 14:52