Re: Bulk loading (and/or major compaction) causing OOM
Marcos Ortiz 2012-12-08, 20:42
On 12/08/2012 11:50 AM, Bryan Beaudreault wrote:
> Thanks for the responses guys. Responses inline
>> When you are doing the bulk load, are you pre-splitting your regions?
>> What OS are you using and what version of Java?
> Yes, regions are pre-split. We calculated them using M/R before attempting
> to bulk load the data. We've done this before with smaller sizes and it
> has worked fine.
> Centos5, java 1.6.0_27
>> Yes, my friend. You should look at all the benefits of the new stable
>> release (0.94.3), so upgrading is my first piece of advice.
> We use CDH currently, so we are working to move to cdh4.1.2, which is 0.92.x
Great to hear.
> On Fri, Dec 7, 2012 at 4:48 PM, Stack <[EMAIL PROTECTED]> wrote:
>> On Fri, Dec 7, 2012 at 1:01 PM, Bryan Beaudreault
>> <[EMAIL PROTECTED]>wrote:
>>> We have a couple tables that had thousands of regions due to the size of
>>> the data in them. We recently changed them to have larger regions (nearly
>>> 4GB). We are trying to bulk load these in now, but every time we do our
>>> servers die with OOM.
>> You mean, you are reloading the data that once was in thousands of regions
>> instead into new regions of 4GB in size?
>> I'd be surprised if the actual bulk load brings on the OOME.
> That's correct. The exact same data is currently live in an older table
> with thousands of smaller regions. Once we get these loaded we will swap
> in the new table and delete the old.
>>> The logs seem to show that there is always a major compaction happening
>>> when the OOM happens. This is among other normal usage from a variety of
>>> apps in our product, so the memstores, block cache, etc are all active
>>> during this time.
>> Could you turn off major compaction during the bulk load to see if that
> Automatic major compactions are actually off for our cluster. It looks
> like they start doing minor compactions as data is loaded in, and that is
> where we first saw the OOM issues. So we tried forcing major compactions
> earlier instead.
>>> I was reading through the compaction code and it doesn't look like it
>>> should take up much memory (depending on how the Reader class works).
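
For illustration only: a compaction is conceptually a streaming merge of store files that are each already sorted, which is why it would not be expected to need much heap by itself. The Python sketch below is not HBase code. The name `merge_sorted_files` and the sample data are hypothetical, and it only shows that a k-way merge holds roughly one pending entry per input file at a time.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

KeyValue = Tuple[str, str]


def merge_sorted_files(files: Iterable[Iterable[KeyValue]]) -> Iterator[KeyValue]:
    """Lazily merge inputs that are each already sorted by row key."""
    # heapq.merge pulls one item at a time from each input, so memory use
    # grows with the number of inputs, not with their total size.
    return heapq.merge(*files, key=lambda kv: kv[0])


# Hypothetical stand-ins for three sorted store files in one region.
store_files: List[List[KeyValue]] = [
    [("row1", "a"), ("row4", "d")],
    [("row2", "b"), ("row5", "e")],
    [("row3", "c")],
]

merged = list(merge_sorted_files(store_files))
merged_keys = [key for key, _ in merged]
```

If memory does grow during a real compaction, the per-file reader state (block buffers, indexes, bloom filters) is the more likely place to look than the merge itself; that is an assumption worth checking against the version in use.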
>> Are there lots of storefiles under each region?
> Yes actually, the bulk loaded data usually seems to contain approximately
> 5-10 files per region. Likely due to the output settings of the M/R job
> that creates this data.
>>> Does anyone with more knowledge of these internals know how bulk load
>>> and major compaction work with regard to memory?
>>> We are running on ec2 c1.xlarge servers with 5GB of heap, and on hbase
>>> version 0.90.4 (I know, I know, we're working to upgrade).
>> How much have you given hbase?
>> If you look at your cluster monitoring, are you swapping?
>> The regionservers are carrying how many regions per server?
> The RegionServers have 5GB of heap (7.5GB total memory on a c1.xlarge, of
> which 1GB goes to DN and rest to OS)
> Swapping is disabled.
> We have around 350 regions per RS currently. What we're doing now with this
> table is part of our effort to decrease the number of regions across all
> tables. We need to do it with minimal downtime though so it is slow going.
> We are aiming for around 200 regions per RS.
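
A rough back-of-the-envelope sketch of why the region count matters with a 5GB heap. The heap size and the 350 and 200 region counts come from the thread; the memstore and block cache fractions are assumptions for illustration only, so substitute the values from your own hbase-site.xml. The helper name `memstore_share_per_region` is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Assumed fractions of the heap, NOT taken from the thread.
MEMSTORE_FRACTION = 0.4      # share of heap allowed for all memstores
BLOCK_CACHE_FRACTION = 0.2   # share of heap given to the block cache


def memstore_share_per_region(heap_bytes: int, regions: int,
                              memstore_fraction: float) -> float:
    """Average memstore budget per region, in bytes, if all regions are active."""
    return heap_bytes * memstore_fraction / regions


heap = 5 * GIB  # RegionServer heap from the thread

current_mib = memstore_share_per_region(heap, 350, MEMSTORE_FRACTION) / MIB
target_mib = memstore_share_per_region(heap, 200, MEMSTORE_FRACTION) / MIB

# Heap left for everything else (compactions, RPC buffers, indexes, ...).
remaining_gib = heap * (1 - MEMSTORE_FRACTION - BLOCK_CACHE_FRACTION) / GIB

print(f"per-region memstore share at 350 regions: {current_mib:.2f} MiB")
print(f"per-region memstore share at 200 regions: {target_mib:.2f} MiB")
print(f"heap outside memstore and block cache:    {remaining_gib:.2f} GiB")
```

Under these assumed fractions, each region gets only a few MiB of memstore on average, and about 2 GiB of heap is left for everything else, including compactions.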
Yes, it would be nice to see fewer regions per server. Have you
considered merging some adjacent
> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
Marcos Luis Ortíz Valmaseda