HBase, mail # user - Re: How to prevent major compaction when doing bulk load provisioning?


Re: How to prevent major compaction when doing bulk load provisioning?
Nicolas Seyvet 2013-03-22, 07:12
@J-D: Thanks, this sounds very likely.

One more thing: in the logs of one slave, I can see the following:
2013-03-21 22:27:15,041 INFO org.apache.hadoop.hbase.regionserver.Store:
Completed major compaction of 9 file(s) in f of
rc_nise,$,1363860406830.5689430f7a27cc511f99dcb62001edc6. into
5418126f3d154ef3aca8027e04512279, size=8.3g; total size for store is 8.3g
[...]
2013-03-21 23:34:31,836 INFO org.apache.hadoop.hbase.regionserver.Store:
Completed major compaction of 5 file(s) in f of
rc_nise,$,1363860406830.5689430f7a27cc511f99dcb62001edc6. into
3bdeb58c57af4ee1a92d22865e707416, size=8.3g; total size for store is 8.3g

Aren't those signs that a major compaction also occurred?
And if so, what could have triggered it?
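
To see how often such entries show up, the compaction messages can be pulled out of a region server log with a short script. This is only a sketch: the pattern is modelled on the two entries quoted above, it assumes each entry is a single line in the real log (they are wrapped in this mail), and it assumes minor compactions are logged with the same wording minus the word "major". The function name parse_compactions is made up for illustration.

```python
import re

# Pattern modelled on the two log entries quoted above. Assumptions: each
# entry is one line in the actual log, and a minor compaction is logged as
# "Completed compaction of ..." (same message without the word "major").
COMPACTION_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} INFO "
    r"org\.apache\.hadoop\.hbase\.regionserver\.Store: "
    r"Completed (?P<major>major )?compaction of (?P<files>\d+) file\(s\) "
    r"in (?P<family>\S+) of (?P<region>\S+) into \S+, size=(?P<size>[^;]+);"
)


def parse_compactions(lines):
    """Return one dict per 'Completed ... compaction' entry found in lines."""
    events = []
    for line in lines:
        m = COMPACTION_RE.match(line.strip())
        if m is None:
            continue  # not a compaction-completed entry
        events.append({
            "timestamp": m.group("ts"),
            "major": m.group("major") is not None,
            "files": int(m.group("files")),
            "family": m.group("family"),
            "region": m.group("region"),
            "size": m.group("size"),
        })
    return events
```

Feeding it the lines of a log file (for example `parse_compactions(open(path))`, with `path` pointing at the region server log) and filtering on the "major" flag gives the timestamps and file counts of the major compactions per region.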

On Thu, Mar 21, 2013 at 8:06 PM, Nicolas Seyvet <[EMAIL PROTECTED]> wrote:

> @Ram: You are entirely correct, I made exactly that mistake of mixing
> up Large and minor compaction.  Looking closely, what I see is that at
> around 200 HFiles per region it starts minor compacting files in groups of
> 10 HFiles.  The "problem" seems to be that this minor compacting never
> stops, even when there are only about 20 HFiles left.  It just keeps on
> going, taking more and more time (I guess because the files to compact are
> getting bigger).
>
> Of course in parallel we keep on adding more and more data.
>
> @J-D: "It seems to me that it would be better if you were able to do a
> single load for all your files." Yes, I agree, but that is not what we
> are testing; our use case is to load 1-minute batch files.