Bryan Beaudreault 2012-12-07, 21:01
Stack 2012-12-07, 21:48
Bryan Beaudreault 2012-12-08, 16:50
Re: Bulk loading (and/or major compaction) causing OOM
Marcos Ortiz 2012-12-08, 20:42
On 12/08/2012 11:50 AM, Bryan Beaudreault wrote:
> Thanks for the responses guys. Responses inline
>
>> When you are doing the bulk load, are you pre-splitting your regions?
>> What OS are you using and what version of Java?
> Yes, regions are pre-split. We calculated them using M/R before attempting
> to bulk load the data. We've done this before with smaller sizes and it
> has worked fine.
>
> Centos5, java 1.6.0_27
>
>> Yes, my friend. You should know all the benefits in the new stable
>> release (0.94.3), so this is the first advice.
> We use CDH currently, so we are working to move to cdh4.1.2, which is 92.x
> branch.
Great to hear.
>
> On Fri, Dec 7, 2012 at 4:48 PM, Stack <[EMAIL PROTECTED]> wrote:
>
>> On Fri, Dec 7, 2012 at 1:01 PM, Bryan Beaudreault
>> <[EMAIL PROTECTED]> wrote:
>>
>>> We have a couple tables that had thousands of regions due to the size of
>>> the data in them. We recently changed them to have larger regions (nearly
>>> 4GB). We are trying to bulk load these in now, but every time we do our
>>> servers die with OOM.
>>>
>> You mean, you are reloading the data that once was in thousands of regions
>> instead into new regions of 4GB in size?
>>
>> I'd be surprised if the actual bulk load brings on the OOME.
>>
> That's correct. The exact same data is currently live in an older table
> with thousands of smaller regions. Once we get these loaded we will swap
> in the new table and delete the old.
>
>>> The logs seem to show that there is always a major compaction happening
>>> when the OOM happens. This is among other normal usage from a variety of
>>> apps in our product, so the memstores, block cache, etc are all active
>>> during this time.
>>>
>> Could you turn off major compaction during the bulk load to see if that
>> helps?
>>
> Automatic major compactions are actually off for our cluster, it looks
> like they start doing minor compactions as data is loaded in, and that is
> where we first saw the OOM issues. So we tried forcing major compactions
> earlier instead.
>
>>> I was reading through the compaction code and it doesn't look like it
>>> should take up much memory (depending on how the Reader class works).
>>>
>> Yes.
>>
>> Are there lots of storefiles under each region?
>>
> Yes actually, the bulk loaded data usually seems to contain approximately
> 5-10 files per region. Likely due to the output settings of the M/R job
> that creates this data.
>
>>> Does anyone with more knowledge of these internals know how bulk load
>>> and major compaction work with regard to memory?
>>>
>>> We are running on ec2 c1.xlarge servers with 5GB of heap, and on hbase
>>> version 0.90.4 (I know, I know, we're working to upgrade).
>>>
>> How much have you given hbase?
>>
>> If you look at your cluster monitoring, are you swapping?
>>
>> The regionservers are carrying how many regions per server?
>>
> The RegionServers have 5GB of heap (7.5GB total memory on a c1.xlarge, of
> which 1GB goes to DN and rest to OS)
> Swapping is disabled.
> We have around 350 regions per RS currently. What we're doing now with this
> table is part of our effort to decrease the number of regions across all
> tables. We need to do it with minimal downtime though so it is slow going.
> We are aiming for around 200 regions per RS.
Yes, it would be nice to see fewer regions per server. Have you considered
merging some adjacent regions?
>
>> St.Ack
>>
>
> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci

--
Marcos Luis Ortíz Valmaseda
about.me/marcosortiz <http://about.me/marcosortiz>
@marcosluis2186 <http://twitter.com/marcosluis2186>
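The memory figures quoted in the thread (7.5GB total on a c1.xlarge, 5GB of RegionServer heap, 1GB for the DataNode, about 350 regions per RS now and about 200 targeted) can be turned into a rough budget. The sketch below is back-of-the-envelope arithmetic only, not a measurement. The function names are hypothetical, and the "heap per region" average is an assumed simplification that ignores how the heap is actually divided between block cache, memstores, and compaction work.

```python
# Rough memory budget using only the numbers quoted in the thread.
# All names here are illustrative; nothing below is an HBase API.

TOTAL_RAM_GB = 7.5   # c1.xlarge total memory, as stated in the thread
RS_HEAP_GB = 5.0     # RegionServer heap
DN_GB = 1.0          # memory given to the DataNode


def os_headroom_gb(total_gb, rs_heap_gb, dn_gb):
    """Memory left for the OS once the RegionServer heap and DataNode are set aside."""
    return total_gb - rs_heap_gb - dn_gb


def avg_heap_per_region_mb(rs_heap_gb, regions):
    """Crude average heap per region; assumes the heap is spread evenly (it is not)."""
    return rs_heap_gb * 1024 / regions


headroom = os_headroom_gb(TOTAL_RAM_GB, RS_HEAP_GB, DN_GB)
current = avg_heap_per_region_mb(RS_HEAP_GB, 350)  # around 350 regions per RS today
target = avg_heap_per_region_mb(RS_HEAP_GB, 200)   # the stated goal of around 200

print(f"OS headroom: {headroom:.1f} GB")
print(f"Average heap per region at 350 regions: {current:.1f} MB")
print(f"Average heap per region at 200 regions: {target:.1f} MB")
```

Even as a crude average, the comparison suggests why reducing the region count per server is attractive on a 5GB heap: each region gets noticeably more room at 200 regions than at 350.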
Bryan Beaudreault 2012-12-08, 21:08
Marcos Ortiz 2012-12-07, 21:16