Re: 3-Hour Periodic Network/CPU/Disk/Latency Spikes
Ted Yu 2013-12-13, 23:33
Attachment didn't go through.
On Dec 13, 2013, at 3:18 PM, Patrick Schless <[EMAIL PROTECTED]> wrote:
> Very interesting, I think we may be on to something. I grabbed all the timestamps for major compactions completing and put them on a graph (see attached). Each horizontal line is an individual server, and the dots are when compactions complete. Each server clearly has a cluster of compactions about every 3 hours, and several of the servers are aligned such that they are compacting at the same time.
> Should we be managing these compactions ourselves? Would it make more sense to have them less frequently (but presumably more expensive), or closer together?
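
One way to check the roughly 3-hour spacing numerically, without the graph, is to group each server's compaction-completion timestamps into clusters and measure the time between cluster starts. The following is only a minimal sketch: the function names, the 30-minute clustering gap, and the sample timestamps are all made up for illustration and are not taken from the actual cluster logs.

```python
from datetime import datetime, timedelta


def cluster_starts(timestamps, gap=timedelta(minutes=30)):
    """Return the first timestamp of each cluster.

    A new cluster begins whenever the time since the previous
    timestamp exceeds `gap` (an assumed threshold, not an HBase setting).
    """
    starts = []
    prev = None
    for ts in sorted(timestamps):
        if prev is None or ts - prev > gap:
            starts.append(ts)
        prev = ts
    return starts


def cluster_intervals(timestamps, gap=timedelta(minutes=30)):
    """Return the time between consecutive cluster starts."""
    starts = cluster_starts(timestamps, gap)
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# Hypothetical completion times for one server: three compactions
# finishing within a few minutes of each other, repeated every 3 hours.
base = datetime(2013, 12, 13, 0, 0)
sample = [
    base + timedelta(hours=3 * k, minutes=m)
    for k in range(4)
    for m in (0, 5, 12)
]

intervals = cluster_intervals(sample)
for interval in intervals:
    print(interval)
```

Running this per server and comparing the cluster start times across servers would show whether the compactions really line up at the same moments, which is what the graph suggests.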
> On Fri, Dec 13, 2013 at 2:19 PM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:
>> Have you taken a look at the logs on the RegionServers during the period?
>> One possibility is compactions happening organically. If you were
>> sustaining a certain level of writes most of the time, I could maybe see
>> that every 3 hours enough store files build up to require compactions.
>> There's nothing else automated in HDFS or HBase that I could see causing
>> On Fri, Dec 13, 2013 at 3:07 PM, Patrick Schless <[EMAIL PROTECTED]> wrote:
>> > CDH4.1.2
>> > HBase 0.92.1
>> > HDFS 2.0.0
>> > Every 3 hours, our production HBase cluster does something that causes all
>> > the data nodes to have a sustained spike in CPU/network/disk. The spike
>> > lasts about 30 mins, and during this time the cluster has greatly increased
>> > latencies for our typical application usage.
>> > I can't find anything in our application that would have such a periodic
>> > and significant behavior. Is there anything that HBase/HDFS might be doing
>> > on its own that would cause this? We're on the default schedule for major
>> > compactions, but I thought that was daily.
>> > Any ideas what could be causing this?
>> > Thanks,
>> > Patrick