HBase >> mail # user >> [HBase 0.92.1] Too many stores files to compact, compaction moving slowly


RE: [HBase 0.92.1] Too many stores files to compact, compaction moving slowly
It was a test scenario, so we have not yet worked out a workaround.
Coming to the question of store files growing: I don't remember offhand
the size of the store files that were created at that time. I will check
and get back to you. It is worth looking into because, as you say,
individual files growing in size is not normal.

Regards
Ram
> -----Original Message-----
> From: Shrijeet Paliwal [mailto:[EMAIL PROTECTED]]
> Sent: Monday, May 14, 2012 10:50 AM
> To: [EMAIL PROTECTED]
> Subject: Re: [HBase 0.92.1] Too many stores files to compact,
> compaction moving slowly
>
> Hello Ram,
> https://issues.apache.org/jira/browse/HBASE-5161 does sound like it.
> First, it was a heavy write scenario; second, the region server was low
> on memory.
> In your case of the 400 GB region, what was the size of the individual
> store files? Were they big as well? While I can see why the number of
> store files would grow, I am not able to understand why the size of
> individual store files keeps growing.
>
> Lastly, what did you do with your 400 GB region? Any workaround?
>
> -Shrijeet
>
> On Sun, May 13, 2012 at 9:29 PM, Ramkrishna.S.Vasudevan <
> [EMAIL PROTECTED]> wrote:
> >
> > Hi Shrijeet,
> > Regarding your last question about the region growing bigger, the
> > following could be one reason.
> >
> > You said your compactions are slow and that you were trying to split
> > some very big store files. Every split would have created a set of
> > reference files. Meanwhile, as more writes happen, more store files
> > are flushed.
> >
> > In the compaction algorithm, whenever reference files are found, an
> > attempt is made to compact them. But what happens is that, even though
> > there are reference files, we try to take the latest files to compact,
> > and the reference files keep losing the race to get compacted, i.e.
> > their priority keeps going down.
> >
> > Please refer to HBASE-5161. It could be your case. In our case the
> > region in fact went up to 400 GB, but it was a heavy write scenario.
> >
> > Regards
> > Ram
> >
> >
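The starvation Ram describes above can be illustrated with a toy model. This is only a sketch under loud assumptions: the "always take the newest files" policy and the names `simulate`, `flushes_per_round`, and `files_per_compaction` are hypothetical and are not HBase's actual selection code. It only shows why old reference files may never be picked when flushes keep arriving faster than compactions drain them.

```python
def simulate(reference_files, flushes_per_round, files_per_compaction, rounds):
    """Toy model: return (reference files left, total store files) after `rounds`.

    Assumption (not HBase's real algorithm): each round, new files are
    flushed and one compaction merges the newest `files_per_compaction`
    files into a single new file.
    """
    # The list is ordered oldest -> newest. "ref" marks a reference file
    # left behind by a split; "new" marks a flushed or compacted file.
    store = ["ref"] * reference_files
    for _ in range(rounds):
        # Ongoing writes flush new store files.
        store.extend(["new"] * flushes_per_round)
        # Newest-first selection: drop the newest files and add their merged output.
        del store[-files_per_compaction:]
        store.append("new")
    return store.count("ref"), len(store)


# Heavy writes: 10 flushes per round keep every compaction busy with new files.
heavy = simulate(reference_files=20, flushes_per_round=10,
                 files_per_compaction=10, rounds=5)

# Light writes: compactions reach the reference files and clear them.
light = simulate(reference_files=20, flushes_per_round=1,
                 files_per_compaction=10, rounds=3)

print("heavy writes:", heavy)
print("light writes:", light)
```

In this model the reference files are only compacted once a round's selection window is wider than the number of newer files, which is consistent with the problem showing up in a heavy write scenario.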
> > > -----Original Message-----
> > > From: Shrijeet Paliwal [mailto:[EMAIL PROTECTED]]
> > > Sent: Monday, May 14, 2012 4:43 AM
> > > To: [EMAIL PROTECTED]
> > > Subject: [HBase 0.92.1] Too many stores files to compact,
> > > compaction moving slowly
> > >
> > > Hi,
> > >
> > > HBase version : 0.92.1
> > > Hadoop version: 0.20.2-cdh3u0
> > >
> > > Relevant configurations:
> > > * hbase.regionserver.fileSplitTimeout : 300000
> > > * hbase.hstore.compactionThreshold : 3
> > > * hbase.hregion.max.filesize : 2147483648
> > > * hbase.hstore.compaction.max : 10
> > > * hbase.hregion.majorcompaction: 864000000000
> > > * HBASE_HEAPSIZE : 4000
> > >
> > > Somehow[1] a user has got his table into a complicated state. The
> > > table has 299 regions, of which roughly 28 have a huge number of
> > > store files, as many as 2300 (snapshot
> > > http://pastie.org/pastes/3907336/text)! To add to the complication,
> > > the individual store files are as big as 14 GB.
> > >
> > > Now I am trying to balance the data in this table. I tried manual
> > > splits, but the split requests were failing with the error "Took too
> > > long to split the files and create the references, aborting split".
> > > To get around this I increased hbase.regionserver.fileSplitTimeout.
> > >
> > > From this point splits happened. I went ahead, identified 10 regions
> > > which had too many store files, and split them. After the splits,
> > > daughter regions were created with references to all the store files
> > > in the parent region, and compactions started. A minor compaction
> > > takes at most 10 files (hbase.hstore.compaction.max). Since there are
> > > 2000+ files (taking one instance for example), it will take about 200
> > > sweeps of minor compaction. Each sweep runs slowly (a couple of
> > > hours), since the individual files (in the set of 10) are too big.
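The sweep estimate in the original message can be checked with a short calculation. The sketch below is a back-of-the-envelope model, not HBase code: the function name `minor_compaction_sweeps` is hypothetical, and it assumes no new flushes arrive, that sweeps run one after another, and that each sweep merges up to hbase.hstore.compaction.max files into a single file. Because a sweep of 10 files leaves one merged file behind, the count drops by 9 per sweep, so the total comes out somewhat above the 200 quoted in the message.

```python
def minor_compaction_sweeps(store_files, files_per_sweep, threshold):
    """Count sweeps until fewer than `threshold` store files remain.

    Assumptions (simplified model): no new flushes, and each sweep merges
    up to `files_per_sweep` files into exactly one output file.
    """
    if files_per_sweep < 2 or threshold < 2:
        raise ValueError("files_per_sweep and threshold must be at least 2")
    sweeps = 0
    while store_files >= threshold:
        batch = min(files_per_sweep, store_files)
        # `batch` files go in, one merged file comes out.
        store_files -= batch - 1
        sweeps += 1
    return sweeps


# Values taken from the message: about 2000 files, compaction.max = 10,
# compactionThreshold = 3, and "a couple of hours" per sweep (taken as 2).
sweeps = minor_compaction_sweeps(store_files=2000, files_per_sweep=10, threshold=3)
hours_per_sweep = 2  # assumption based on "couple of hours"
print(f"{sweeps} sweeps, roughly {sweeps * hours_per_sweep / 24:.1f} days")
```

Under these assumptions a single region with 2000 files needs 222 sweeps, which at about two hours each is on the order of weeks, so any continued write load would push the backlog out further.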