|
Wayne
2011-01-11, 22:34
Ted Dunning
2011-01-11, 22:51
Wayne
2011-01-11, 22:57
Ted Dunning
2011-01-11, 23:31
Friso van Vollenhoven
2011-01-12, 09:35
Wayne
2011-01-12, 12:50
Friso van Vollenhoven
2011-01-12, 12:58
Stack
2011-01-12, 16:17
Stack
2011-01-12, 16:20
Ted Dunning
2011-01-12, 16:24
Ted Dunning
2011-01-12, 16:24
Wayne
2011-01-12, 16:40
Wayne
2011-01-12, 17:03
Stack
2011-01-12, 18:52
Wayne
2011-01-12, 19:52
Stack
2011-01-12, 19:57
Ted Dunning
2011-01-12, 21:29
Tatsuya Kawano
2011-01-13, 11:59
Wayne
2011-01-13, 12:49
Wayne
2011-01-14, 14:34
Christopher Tarnas
2011-01-14, 15:28
Jonathan Gray
2011-01-14, 17:29
Christopher Tarnas
2011-01-14, 17:53
Wayne
2011-01-14, 18:02
Stack
2011-01-14, 18:05
Geoff Hendrey
2011-01-14, 18:06
Todd Lipcon
2011-01-27, 03:51
Wayne
2011-01-27, 14:54
|
-
Cluster Wide PausesWayne 2011-01-11, 22:34
We have very frequent cluster wide pauses that stop all reads and writes
for seconds. We are constantly loading data to this cluster of 10 nodes. These pauses can happen as frequently as every minute but sometimes are not seen for 15+ minutes. Basically watching the Region server list with request counts is the only evidence of what is going on. All reads and writes totally stop and if there is ever any activity it is on the node hosting the .META. table with a request count of region count + 1. This problem seems to be worse with a larger region size. We tried a 1GB region size and saw this more than we saw actual activity (and stopped using a larger region size because of it). We went back to the default region size and it was better, but we had too many regions so now we are up to 512M for a region size and we are seeing it more again. Does anyone know what this is? We have dug into all of the logs to find some sort of pause but are not able to find anything. Is this an wal hlog roll? Is this a region split or compaction? Of course our biggest fear is a GC pause on the master but we do not have java logging turned on with the master to tell. What could possibly stop the entire cluster from working for seconds at a time very frequently? Thanks in advance for any ideas of what could be causing this.
-
Re: Cluster Wide PausesTed Dunning 2011-01-11, 22:51
I am running YCSB to load a modest amount of data (2 M rows) and see
something similar. For example: 10 sec: 135327 operations; 13505.69 current ops/sec; [INSERT AverageLatency(ms)=0.84] 20 sec: 204440 operations; 6883.08 current ops/sec; [INSERT AverageLatency(ms)=3.51] 30 sec: 250166 operations; 4572.14 current ops/sec; [INSERT AverageLatency(ms)=0.25] 40 sec: 348603 operations; 9843.7 current ops/sec; [INSERT AverageLatency(ms)=3.02] * 50 sec: 348603 operations; 0 current ops/sec; * 60 sec: 447040 operations; 9821.11 current ops/sec; [INSERT AverageLatency(ms)=3.59] 70 sec: 504461 operations; 5741.53 current ops/sec; [INSERT AverageLatency(ms)=1.88] 80 sec: 598797 operations; 9433.6 current ops/sec; [INSERT AverageLatency(ms)=2.53] 90 sec: 676724 operations; 7791.92 current ops/sec; [INSERT AverageLatency(ms)=1.49] 100 sec: 758754 operations; 8202.18 current ops/sec; [INSERT AverageLatency(ms)=2.71] 110 sec: 779263 operations; 2050.9 current ops/sec; [INSERT AverageLatency(ms)=2.26] 120 sec: 890004 operations; 11060.83 current ops/sec; [INSERT AverageLatency(ms)=2.48] 130 sec: 996641 operations; 10662.63 current ops/sec; [INSERT AverageLatency(ms)=1.56] * 140 sec: 996641 operations; 0 current ops/sec; * 150 sec: 1111600 operations; 11494.75 current ops/sec; [INSERT AverageLatency(ms)=3.81] 160 sec: 1175494 operations; 6371.56 current ops/sec; [INSERT AverageLatency(ms)=3.08] 170 sec: 1209919 operations; 3442.16 current ops/sec; [INSERT AverageLatency(ms)=0.04] * 180 sec: 1209919 operations; 0 current ops/sec; * 190 sec: 1265127 operations; 5515.84 current ops/sec; [INSERT AverageLatency(ms)=10.25] 200 sec: 1382184 operations; 11705.7 current ops/sec; [INSERT AverageLatency(ms)=0.43] 210 sec: 1431400 operations; 4921.11 current ops/sec; [INSERT AverageLatency(ms)=2.39] * 220 sec: 1431400 operations; 0 current ops/sec; * 230 sec: 1490315 operations; 5869.2 current ops/sec; [INSERT AverageLatency(ms)=9.58] 240 sec: 1513430 operations; 2311.27 current ops/sec; [INSERT AverageLatency(ms)=0.29] 250 sec: 1599562 operations; 8613.2 current ops/sec; [INSERT AverageLatency(ms)=3.86] 260 sec: 1725739 operations; 12616.44 current ops/sec; [INSERT AverageLatency(ms)=1.35] 270 sec: 1823203 operations; 9746.4 current ops/sec; [INSERT AverageLatency(ms)=1.39] 280 sec: 1911486 operations; 8827.42 current ops/sec; [INSERT AverageLatency(ms)=1.25] 290 sec: 2000000 operations; 8850.51 current ops/sec; [INSERT AverageLatency(ms)=1.41] This is with 0.89 running on 5 region servers on a non-hdfs append-enabled substrate. I am logging GC's on both master and regions and I don't see a correlation between either. I am still verifying that, so it could be wrong, but I am pretty sure not. What I do see is lots of small compactions that look roughly like this: {Heap before GC invocations=181 (full 23): par new generation total 19136K, used 17895K [0x00007f9aa40a0000, 0x00007f9aa5560000, 0x00007f9aabda0000) eden space 17024K, 99% used [0x00007f9aa40a0000, 0x00007f9aa513b058, 0x00007f9aa5140000) from space 2112K, 42% used [0x00007f9aa5350000, 0x00007f9aa542ef38, 0x00007f9aa5560000) to space 2112K, 0% used [0x00007f9aa5140000, 0x00007f9aa5140000, 0x00007f9aa5350000) concurrent mark-sweep generation total 252700K, used 83769K [0x00007f9aabda0000, 0x00007f9abb467000, 0x00007f9ae28a0000) concurrent-mark-sweep perm gen total 29120K, used 17377K [0x00007f9ae28a0000, 0x00007f9ae4510000, 0x00007f9ae7ca0000) 542.472: [GC 542.472: [ParNew Desired survivor size 1081344 bytes, new threshold 4 (max 4) - age 1: 274408 bytes, 274408 total - age 2: 47352 bytes, 321760 total - age 3: 10304 bytes, 332064 total - age 4: 10472 bytes, 342536 total : 17895K->510K(19136K), 0.0586100 secs] 101665K->84784K(271836K) icms_dc=1 , 0.0586740 secs] [Times: user=0.36 sys=0.00, real=0.06 secs] Heap after GC invocations=182 (full 23): par new generation total 19136K, used 510K [0x00007f9aa40a0000, 0x00007f9aa5560000, 0x00007f9aabda0000) eden space 17024K, 0% used [0x00007f9aa40a0000, 0x00007f9aa40a0000, 0x00007f9aa5140000) from space 2112K, 24% used [0x00007f9aa5140000, 0x00007f9aa51bfa98, 0x00007f9aa5350000) to space 2112K, 0% used [0x00007f9aa5350000, 0x00007f9aa5350000, 0x00007f9aa5560000) concurrent mark-sweep generation total 252700K, used 84274K [0x00007f9aabda0000, 0x00007f9abb467000, 0x00007f9ae28a0000) concurrent-mark-sweep perm gen total 29120K, used 17377K [0x00007f9ae28a0000, 0x00007f9ae4510000, 0x00007f9ae7ca0000) } 542.730: [CMS-concurrent-mark: 0.060/2.344 secs] [Times: user=1.48 sys=0.05, real=2.35 secs] 542.730: [CMS-concurrent-preclean-start] 542.744: [CMS-concurrent-preclean: 0.014/0.014 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 542.744: [CMS-concurrent-abortable-preclean-start] It is unusual to see things last 4 generations. There are clearly full collections going on, but not a lot of them and the heap is small enough on this system that it should be fast. I will be gather more information. On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote:
-
Re: Cluster Wide PausesWayne 2011-01-11, 22:57
What is shared across all nodes that could stop everything? Originally I
suspected the node with the .META. table and GC pauses but could never find the smoking gun. What JVM are you running? We use u23 and I am starting to consider rolling back to u16 per the previous post. On Tue, Jan 11, 2011 at 5:51 PM, Ted Dunning <[EMAIL PROTECTED]> wrote: > I am running YCSB to load a modest amount of data (2 M rows) and see > something similar. > > For example: > > 10 sec: 135327 operations; 13505.69 current ops/sec; [INSERT > AverageLatency(ms)=0.84] > 20 sec: 204440 operations; 6883.08 current ops/sec; [INSERT > AverageLatency(ms)=3.51] > 30 sec: 250166 operations; 4572.14 current ops/sec; [INSERT > AverageLatency(ms)=0.25] > 40 sec: 348603 operations; 9843.7 current ops/sec; [INSERT > AverageLatency(ms)=3.02] > * 50 sec: 348603 operations; 0 current ops/sec; * > 60 sec: 447040 operations; 9821.11 current ops/sec; [INSERT > AverageLatency(ms)=3.59] > 70 sec: 504461 operations; 5741.53 current ops/sec; [INSERT > AverageLatency(ms)=1.88] > 80 sec: 598797 operations; 9433.6 current ops/sec; [INSERT > AverageLatency(ms)=2.53] > 90 sec: 676724 operations; 7791.92 current ops/sec; [INSERT > AverageLatency(ms)=1.49] > 100 sec: 758754 operations; 8202.18 current ops/sec; [INSERT > AverageLatency(ms)=2.71] > 110 sec: 779263 operations; 2050.9 current ops/sec; [INSERT > AverageLatency(ms)=2.26] > 120 sec: 890004 operations; 11060.83 current ops/sec; [INSERT > AverageLatency(ms)=2.48] > 130 sec: 996641 operations; 10662.63 current ops/sec; [INSERT > AverageLatency(ms)=1.56] > * 140 sec: 996641 operations; 0 current ops/sec; * > 150 sec: 1111600 operations; 11494.75 current ops/sec; [INSERT > AverageLatency(ms)=3.81] > 160 sec: 1175494 operations; 6371.56 current ops/sec; [INSERT > AverageLatency(ms)=3.08] > 170 sec: 1209919 operations; 3442.16 current ops/sec; [INSERT > AverageLatency(ms)=0.04] > * 180 sec: 1209919 operations; 0 current ops/sec; * > 190 sec: 1265127 operations; 5515.84 current ops/sec; [INSERT > AverageLatency(ms)=10.25] > 200 sec: 1382184 operations; 11705.7 current ops/sec; [INSERT > AverageLatency(ms)=0.43] > 210 sec: 1431400 operations; 4921.11 current ops/sec; [INSERT > AverageLatency(ms)=2.39] > * 220 sec: 1431400 operations; 0 current ops/sec; * > 230 sec: 1490315 operations; 5869.2 current ops/sec; [INSERT > AverageLatency(ms)=9.58] > 240 sec: 1513430 operations; 2311.27 current ops/sec; [INSERT > AverageLatency(ms)=0.29] > 250 sec: 1599562 operations; 8613.2 current ops/sec; [INSERT > AverageLatency(ms)=3.86] > 260 sec: 1725739 operations; 12616.44 current ops/sec; [INSERT > AverageLatency(ms)=1.35] > 270 sec: 1823203 operations; 9746.4 current ops/sec; [INSERT > AverageLatency(ms)=1.39] > 280 sec: 1911486 operations; 8827.42 current ops/sec; [INSERT > AverageLatency(ms)=1.25] > 290 sec: 2000000 operations; 8850.51 current ops/sec; [INSERT > AverageLatency(ms)=1.41] > > > This is with 0.89 running on 5 region servers on a non-hdfs append-enabled > substrate. > > I am logging GC's on both master and regions and I don't see a correlation > between either. I am still verifying that, so it could be wrong, but I am > pretty sure not. What I do see is lots of small compactions that look > roughly like this: > > {Heap before GC invocations=181 (full 23): > par new generation total 19136K, used 17895K [0x00007f9aa40a0000, > 0x00007f9aa5560000, 0x00007f9aabda0000) > eden space 17024K, 99% used [0x00007f9aa40a0000, 0x00007f9aa513b058, > 0x00007f9aa5140000) > from space 2112K, 42% used [0x00007f9aa5350000, 0x00007f9aa542ef38, > 0x00007f9aa5560000) > to space 2112K, 0% used [0x00007f9aa5140000, 0x00007f9aa5140000, > 0x00007f9aa5350000) > concurrent mark-sweep generation total 252700K, used 83769K > [0x00007f9aabda0000, 0x00007f9abb467000, 0x00007f9ae28a0000) > concurrent-mark-sweep perm gen total 29120K, used 17377K > [0x00007f9ae28a0000, 0x00007f9ae4510000, 0x00007f9ae7ca0000) > 542.472: [GC 542.472: [ParNew
-
Re: Cluster Wide PausesTed Dunning 2011-01-11, 23:31
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02) Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode) On Tue, Jan 11, 2011 at 2:57 PM, Wayne <[EMAIL PROTECTED]> wrote: > What is shared across all nodes that could stop everything? Originally I > suspected the node with the .META. table and GC pauses but could never find > the smoking gun. What JVM are you running? We use u23 and I am starting to > consider rolling back to u16 per the previous post. > > On Tue, Jan 11, 2011 at 5:51 PM, Ted Dunning <[EMAIL PROTECTED]> > wrote: > > > I am running YCSB to load a modest amount of data (2 M rows) and see > > something similar. > > > > For example: > > > > 10 sec: 135327 operations; 13505.69 current ops/sec; [INSERT > > AverageLatency(ms)=0.84] > > 20 sec: 204440 operations; 6883.08 current ops/sec; [INSERT > > AverageLatency(ms)=3.51] > > 30 sec: 250166 operations; 4572.14 current ops/sec; [INSERT > > AverageLatency(ms)=0.25] > > 40 sec: 348603 operations; 9843.7 current ops/sec; [INSERT > > AverageLatency(ms)=3.02] > > * 50 sec: 348603 operations; 0 current ops/sec; * > > 60 sec: 447040 operations; 9821.11 current ops/sec; [INSERT > > AverageLatency(ms)=3.59] > > 70 sec: 504461 operations; 5741.53 current ops/sec; [INSERT > > AverageLatency(ms)=1.88] > > 80 sec: 598797 operations; 9433.6 current ops/sec; [INSERT > > AverageLatency(ms)=2.53] > > 90 sec: 676724 operations; 7791.92 current ops/sec; [INSERT > > AverageLatency(ms)=1.49] > > 100 sec: 758754 operations; 8202.18 current ops/sec; [INSERT > > AverageLatency(ms)=2.71] > > 110 sec: 779263 operations; 2050.9 current ops/sec; [INSERT > > AverageLatency(ms)=2.26] > > 120 sec: 890004 operations; 11060.83 current ops/sec; [INSERT > > AverageLatency(ms)=2.48] > > 130 sec: 996641 operations; 10662.63 current ops/sec; [INSERT > > AverageLatency(ms)=1.56] > > * 140 sec: 996641 operations; 0 current ops/sec; * > > 150 sec: 1111600 operations; 11494.75 current ops/sec; [INSERT > > AverageLatency(ms)=3.81] > > 160 sec: 1175494 operations; 6371.56 current ops/sec; [INSERT > > AverageLatency(ms)=3.08] > > 170 sec: 1209919 operations; 3442.16 current ops/sec; [INSERT > > AverageLatency(ms)=0.04] > > * 180 sec: 1209919 operations; 0 current ops/sec; * > > 190 sec: 1265127 operations; 5515.84 current ops/sec; [INSERT > > AverageLatency(ms)=10.25] > > 200 sec: 1382184 operations; 11705.7 current ops/sec; [INSERT > > AverageLatency(ms)=0.43] > > 210 sec: 1431400 operations; 4921.11 current ops/sec; [INSERT > > AverageLatency(ms)=2.39] > > * 220 sec: 1431400 operations; 0 current ops/sec; * > > 230 sec: 1490315 operations; 5869.2 current ops/sec; [INSERT > > AverageLatency(ms)=9.58] > > 240 sec: 1513430 operations; 2311.27 current ops/sec; [INSERT > > AverageLatency(ms)=0.29] > > 250 sec: 1599562 operations; 8613.2 current ops/sec; [INSERT > > AverageLatency(ms)=3.86] > > 260 sec: 1725739 operations; 12616.44 current ops/sec; [INSERT > > AverageLatency(ms)=1.35] > > 270 sec: 1823203 operations; 9746.4 current ops/sec; [INSERT > > AverageLatency(ms)=1.39] > > 280 sec: 1911486 operations; 8827.42 current ops/sec; [INSERT > > AverageLatency(ms)=1.25] > > 290 sec: 2000000 operations; 8850.51 current ops/sec; [INSERT > > AverageLatency(ms)=1.41] > > > > > > This is with 0.89 running on 5 region servers on a non-hdfs > append-enabled > > substrate. > > > > I am logging GC's on both master and regions and I don't see a > correlation > > between either. I am still verifying that, so it could be wrong, but I > am > > pretty sure not. What I do see is lots of small compactions that look > > roughly like this: > > > > {Heap before GC invocations=181 (full 23): > > par new generation total 19136K, used 17895K [0x00007f9aa40a0000, > > 0x00007f9aa5560000, 0x00007f9aabda0000) > > eden space 17024K, 99% used [0x00007f9aa40a0000, 0x00007f9aa513b058, > > 0x00007f9aa5140000) > > from space 2112K, 42% used [0x00007f9aa5350000, 0x00007f9aa542ef38,
-
Re: Cluster Wide PausesFriso van Vollenhoven 2011-01-12, 09:35
We have been on u23 for some time and are doing fine with HBase 0.89 and CDH3b2. Also with write heavy loads. We see RS activity drop around memstore flushes, compactions and especially splits.
Friso On 11 jan 2011, at 23:57, Wayne wrote: > What is shared across all nodes that could stop everything? Originally I > suspected the node with the .META. table and GC pauses but could never find > the smoking gun. What JVM are you running? We use u23 and I am starting to > consider rolling back to u16 per the previous post. > > On Tue, Jan 11, 2011 at 5:51 PM, Ted Dunning <[EMAIL PROTECTED]> wrote: > >> I am running YCSB to load a modest amount of data (2 M rows) and see >> something similar. >> >> For example: >> >> 10 sec: 135327 operations; 13505.69 current ops/sec; [INSERT >> AverageLatency(ms)=0.84] >> 20 sec: 204440 operations; 6883.08 current ops/sec; [INSERT >> AverageLatency(ms)=3.51] >> 30 sec: 250166 operations; 4572.14 current ops/sec; [INSERT >> AverageLatency(ms)=0.25] >> 40 sec: 348603 operations; 9843.7 current ops/sec; [INSERT >> AverageLatency(ms)=3.02] >> * 50 sec: 348603 operations; 0 current ops/sec; * >> 60 sec: 447040 operations; 9821.11 current ops/sec; [INSERT >> AverageLatency(ms)=3.59] >> 70 sec: 504461 operations; 5741.53 current ops/sec; [INSERT >> AverageLatency(ms)=1.88] >> 80 sec: 598797 operations; 9433.6 current ops/sec; [INSERT >> AverageLatency(ms)=2.53] >> 90 sec: 676724 operations; 7791.92 current ops/sec; [INSERT >> AverageLatency(ms)=1.49] >> 100 sec: 758754 operations; 8202.18 current ops/sec; [INSERT >> AverageLatency(ms)=2.71] >> 110 sec: 779263 operations; 2050.9 current ops/sec; [INSERT >> AverageLatency(ms)=2.26] >> 120 sec: 890004 operations; 11060.83 current ops/sec; [INSERT >> AverageLatency(ms)=2.48] >> 130 sec: 996641 operations; 10662.63 current ops/sec; [INSERT >> AverageLatency(ms)=1.56] >> * 140 sec: 996641 operations; 0 current ops/sec; * >> 150 sec: 1111600 operations; 11494.75 current ops/sec; [INSERT >> AverageLatency(ms)=3.81] >> 160 sec: 1175494 operations; 6371.56 current ops/sec; [INSERT >> AverageLatency(ms)=3.08] >> 170 sec: 1209919 operations; 3442.16 current ops/sec; [INSERT >> AverageLatency(ms)=0.04] >> * 180 sec: 1209919 operations; 0 current ops/sec; * >> 190 sec: 1265127 operations; 5515.84 current ops/sec; [INSERT >> AverageLatency(ms)=10.25] >> 200 sec: 1382184 operations; 11705.7 current ops/sec; [INSERT >> AverageLatency(ms)=0.43] >> 210 sec: 1431400 operations; 4921.11 current ops/sec; [INSERT >> AverageLatency(ms)=2.39] >> * 220 sec: 1431400 operations; 0 current ops/sec; * >> 230 sec: 1490315 operations; 5869.2 current ops/sec; [INSERT >> AverageLatency(ms)=9.58] >> 240 sec: 1513430 operations; 2311.27 current ops/sec; [INSERT >> AverageLatency(ms)=0.29] >> 250 sec: 1599562 operations; 8613.2 current ops/sec; [INSERT >> AverageLatency(ms)=3.86] >> 260 sec: 1725739 operations; 12616.44 current ops/sec; [INSERT >> AverageLatency(ms)=1.35] >> 270 sec: 1823203 operations; 9746.4 current ops/sec; [INSERT >> AverageLatency(ms)=1.39] >> 280 sec: 1911486 operations; 8827.42 current ops/sec; [INSERT >> AverageLatency(ms)=1.25] >> 290 sec: 2000000 operations; 8850.51 current ops/sec; [INSERT >> AverageLatency(ms)=1.41] >> >> >> This is with 0.89 running on 5 region servers on a non-hdfs append-enabled >> substrate. >> >> I am logging GC's on both master and regions and I don't see a correlation >> between either. I am still verifying that, so it could be wrong, but I am >> pretty sure not. What I do see is lots of small compactions that look >> roughly like this: >> >> {Heap before GC invocations=181 (full 23): >> par new generation total 19136K, used 17895K [0x00007f9aa40a0000, >> 0x00007f9aa5560000, 0x00007f9aabda0000) >> eden space 17024K, 99% used [0x00007f9aa40a0000, 0x00007f9aa513b058, >> 0x00007f9aa5140000) >> from space 2112K, 42% used [0x00007f9aa5350000, 0x00007f9aa542ef38, >> 0x00007f9aa5560000) >> to space 2112K, 0% used [0x00007f9aa5140000, 0x00007f9aa5140000,
-
Re: Cluster Wide PausesWayne 2011-01-12, 12:50
Do you see pauses across all nodes at the same time? Does a split pause *ALL
* nodes and if so why? On Wed, Jan 12, 2011 at 4:35 AM, Friso van Vollenhoven < [EMAIL PROTECTED]> wrote: > We have been on u23 for some time and are doing fine with HBase 0.89 and > CDH3b2. Also with write heavy loads. We see RS activity drop around memstore > flushes, compactions and especially splits. > > Friso > > > > On 11 jan 2011, at 23:57, Wayne wrote: > > > What is shared across all nodes that could stop everything? Originally I > > suspected the node with the .META. table and GC pauses but could never > find > > the smoking gun. What JVM are you running? We use u23 and I am starting > to > > consider rolling back to u16 per the previous post. > > > > On Tue, Jan 11, 2011 at 5:51 PM, Ted Dunning <[EMAIL PROTECTED]> > wrote: > > > >> I am running YCSB to load a modest amount of data (2 M rows) and see > >> something similar. > >> > >> For example: > >> > >> 10 sec: 135327 operations; 13505.69 current ops/sec; [INSERT > >> AverageLatency(ms)=0.84] > >> 20 sec: 204440 operations; 6883.08 current ops/sec; [INSERT > >> AverageLatency(ms)=3.51] > >> 30 sec: 250166 operations; 4572.14 current ops/sec; [INSERT > >> AverageLatency(ms)=0.25] > >> 40 sec: 348603 operations; 9843.7 current ops/sec; [INSERT > >> AverageLatency(ms)=3.02] > >> * 50 sec: 348603 operations; 0 current ops/sec; * > >> 60 sec: 447040 operations; 9821.11 current ops/sec; [INSERT > >> AverageLatency(ms)=3.59] > >> 70 sec: 504461 operations; 5741.53 current ops/sec; [INSERT > >> AverageLatency(ms)=1.88] > >> 80 sec: 598797 operations; 9433.6 current ops/sec; [INSERT > >> AverageLatency(ms)=2.53] > >> 90 sec: 676724 operations; 7791.92 current ops/sec; [INSERT > >> AverageLatency(ms)=1.49] > >> 100 sec: 758754 operations; 8202.18 current ops/sec; [INSERT > >> AverageLatency(ms)=2.71] > >> 110 sec: 779263 operations; 2050.9 current ops/sec; [INSERT > >> AverageLatency(ms)=2.26] > >> 120 sec: 890004 operations; 11060.83 current ops/sec; [INSERT > >> AverageLatency(ms)=2.48] > >> 130 sec: 996641 operations; 10662.63 current ops/sec; [INSERT > >> AverageLatency(ms)=1.56] > >> * 140 sec: 996641 operations; 0 current ops/sec; * > >> 150 sec: 1111600 operations; 11494.75 current ops/sec; [INSERT > >> AverageLatency(ms)=3.81] > >> 160 sec: 1175494 operations; 6371.56 current ops/sec; [INSERT > >> AverageLatency(ms)=3.08] > >> 170 sec: 1209919 operations; 3442.16 current ops/sec; [INSERT > >> AverageLatency(ms)=0.04] > >> * 180 sec: 1209919 operations; 0 current ops/sec; * > >> 190 sec: 1265127 operations; 5515.84 current ops/sec; [INSERT > >> AverageLatency(ms)=10.25] > >> 200 sec: 1382184 operations; 11705.7 current ops/sec; [INSERT > >> AverageLatency(ms)=0.43] > >> 210 sec: 1431400 operations; 4921.11 current ops/sec; [INSERT > >> AverageLatency(ms)=2.39] > >> * 220 sec: 1431400 operations; 0 current ops/sec; * > >> 230 sec: 1490315 operations; 5869.2 current ops/sec; [INSERT > >> AverageLatency(ms)=9.58] > >> 240 sec: 1513430 operations; 2311.27 current ops/sec; [INSERT > >> AverageLatency(ms)=0.29] > >> 250 sec: 1599562 operations; 8613.2 current ops/sec; [INSERT > >> AverageLatency(ms)=3.86] > >> 260 sec: 1725739 operations; 12616.44 current ops/sec; [INSERT > >> AverageLatency(ms)=1.35] > >> 270 sec: 1823203 operations; 9746.4 current ops/sec; [INSERT > >> AverageLatency(ms)=1.39] > >> 280 sec: 1911486 operations; 8827.42 current ops/sec; [INSERT > >> AverageLatency(ms)=1.25] > >> 290 sec: 2000000 operations; 8850.51 current ops/sec; [INSERT > >> AverageLatency(ms)=1.41] > >> > >> > >> This is with 0.89 running on 5 region servers on a non-hdfs > append-enabled > >> substrate. > >> > >> I am logging GC's on both master and regions and I don't see a > correlation > >> between either. I am still verifying that, so it could be wrong, but I > am > >> pretty sure not. What I do see is lots of small compactions that look > >> roughly like this: > >> > >> {Heap before GC invocations=181 (full 23):
-
Re: Cluster Wide PausesFriso van Vollenhoven 2011-01-12, 12:58
A split should not cause all nodes to halt. But if you start a job that has good distribution on a otherwise clean cluster, it can appear that way, because they happen roughly at the same time. I have seen that some times.
Friso On 12 jan 2011, at 13:50, Wayne wrote: > Do you see pauses across all nodes at the same time? Does a split pause *ALL > * nodes and if so why? > > On Wed, Jan 12, 2011 at 4:35 AM, Friso van Vollenhoven < > [EMAIL PROTECTED]> wrote: > >> We have been on u23 for some time and are doing fine with HBase 0.89 and >> CDH3b2. Also with write heavy loads. We see RS activity drop around memstore >> flushes, compactions and especially splits. >> >> Friso >> >> >> >> On 11 jan 2011, at 23:57, Wayne wrote: >> >>> What is shared across all nodes that could stop everything? Originally I >>> suspected the node with the .META. table and GC pauses but could never >> find >>> the smoking gun. What JVM are you running? We use u23 and I am starting >> to >>> consider rolling back to u16 per the previous post. >>> >>> On Tue, Jan 11, 2011 at 5:51 PM, Ted Dunning <[EMAIL PROTECTED]> >> wrote: >>> >>>> I am running YCSB to load a modest amount of data (2 M rows) and see >>>> something similar. >>>> >>>> For example: >>>> >>>> 10 sec: 135327 operations; 13505.69 current ops/sec; [INSERT >>>> AverageLatency(ms)=0.84] >>>> 20 sec: 204440 operations; 6883.08 current ops/sec; [INSERT >>>> AverageLatency(ms)=3.51] >>>> 30 sec: 250166 operations; 4572.14 current ops/sec; [INSERT >>>> AverageLatency(ms)=0.25] >>>> 40 sec: 348603 operations; 9843.7 current ops/sec; [INSERT >>>> AverageLatency(ms)=3.02] >>>> * 50 sec: 348603 operations; 0 current ops/sec; * >>>> 60 sec: 447040 operations; 9821.11 current ops/sec; [INSERT >>>> AverageLatency(ms)=3.59] >>>> 70 sec: 504461 operations; 5741.53 current ops/sec; [INSERT >>>> AverageLatency(ms)=1.88] >>>> 80 sec: 598797 operations; 9433.6 current ops/sec; [INSERT >>>> AverageLatency(ms)=2.53] >>>> 90 sec: 676724 operations; 7791.92 current ops/sec; [INSERT >>>> AverageLatency(ms)=1.49] >>>> 100 sec: 758754 operations; 8202.18 current ops/sec; [INSERT >>>> AverageLatency(ms)=2.71] >>>> 110 sec: 779263 operations; 2050.9 current ops/sec; [INSERT >>>> AverageLatency(ms)=2.26] >>>> 120 sec: 890004 operations; 11060.83 current ops/sec; [INSERT >>>> AverageLatency(ms)=2.48] >>>> 130 sec: 996641 operations; 10662.63 current ops/sec; [INSERT >>>> AverageLatency(ms)=1.56] >>>> * 140 sec: 996641 operations; 0 current ops/sec; * >>>> 150 sec: 1111600 operations; 11494.75 current ops/sec; [INSERT >>>> AverageLatency(ms)=3.81] >>>> 160 sec: 1175494 operations; 6371.56 current ops/sec; [INSERT >>>> AverageLatency(ms)=3.08] >>>> 170 sec: 1209919 operations; 3442.16 current ops/sec; [INSERT >>>> AverageLatency(ms)=0.04] >>>> * 180 sec: 1209919 operations; 0 current ops/sec; * >>>> 190 sec: 1265127 operations; 5515.84 current ops/sec; [INSERT >>>> AverageLatency(ms)=10.25] >>>> 200 sec: 1382184 operations; 11705.7 current ops/sec; [INSERT >>>> AverageLatency(ms)=0.43] >>>> 210 sec: 1431400 operations; 4921.11 current ops/sec; [INSERT >>>> AverageLatency(ms)=2.39] >>>> * 220 sec: 1431400 operations; 0 current ops/sec; * >>>> 230 sec: 1490315 operations; 5869.2 current ops/sec; [INSERT >>>> AverageLatency(ms)=9.58] >>>> 240 sec: 1513430 operations; 2311.27 current ops/sec; [INSERT >>>> AverageLatency(ms)=0.29] >>>> 250 sec: 1599562 operations; 8613.2 current ops/sec; [INSERT >>>> AverageLatency(ms)=3.86] >>>> 260 sec: 1725739 operations; 12616.44 current ops/sec; [INSERT >>>> AverageLatency(ms)=1.35] >>>> 270 sec: 1823203 operations; 9746.4 current ops/sec; [INSERT >>>> AverageLatency(ms)=1.39] >>>> 280 sec: 1911486 operations; 8827.42 current ops/sec; [INSERT >>>> AverageLatency(ms)=1.25] >>>> 290 sec: 2000000 operations; 8850.51 current ops/sec; [INSERT >>>> AverageLatency(ms)=1.41] >>>> >>>> >>>> This is with 0.89 running on 5 region servers on a non-hdfs >> append-enabled >>
-
Re: Cluster Wide PausesStack 2011-01-12, 16:17
On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote:
> We have very frequent cluster wide pauses that stop all reads and writes > for seconds. All reads and all writes? I've seen the pause too for writes. Its something I've always meant to look into. Friso postulates one cause. Another that we've talked of is a region taking a while to come back on line after a split or a rebalance for whatever reason. Client loading might be 'random' spraying over lots of random regions but they all get stuck waiting on one particular region to come back online. I suppose reads could be blocked for same reason if all are trying to read from the offlined region. What version of hbase are you using? Splits should be faster in 0.90 now that the split daughters come up on the same region. Sorry I don't have a better answer for you. Need to dig in. File a JIRA. If you want to help out some, stick some data up in it. Some suggestions would be to enable logging of when we lookup region locations in client and then note when requests go to zero. Can you figure what region the clients are waiting on (if they are waiting on any). If you can pull out a particular one, try and elicit its history at time of blockage. Is it being moved or mid-split? I suppose it makes sense that bigger regions would make the situation 'worse'. I can take a look at it too. St.Ack We are constantly loading data to this cluster of 10 nodes. > These pauses can happen as frequently as every minute but sometimes are not > seen for 15+ minutes. Basically watching the Region server list with request > counts is the only evidence of what is going on. All reads and writes > totally stop and if there is ever any activity it is on the node hosting the > .META. table with a request count of region count + 1. This problem seems to > be worse with a larger region size. We tried a 1GB region size and saw this > more than we saw actual activity (and stopped using a larger region size > because of it). We went back to the default region size and it was better, > but we had too many regions so now we are up to 512M for a region size and > we are seeing it more again. > > Does anyone know what this is? We have dug into all of the logs to find some > sort of pause but are not able to find anything. Is this an wal hlog roll? > Is this a region split or compaction? Of course our biggest fear is a GC > pause on the master but we do not have java logging turned on with the > master to tell. What could possibly stop the entire cluster from working for > seconds at a time very frequently? > > Thanks in advance for any ideas of what could be causing this. >
-
Re: Cluster Wide PausesStack 2011-01-12, 16:20
Its interesting that you are running into this issue Ted on a
'non-hdfs append-enabled substrate'. This would seem to rule out 'slow hdfs' as culprit. St.Ack On Tue, Jan 11, 2011 at 2:51 PM, Ted Dunning <[EMAIL PROTECTED]> wrote: > I am running YCSB to load a modest amount of data (2 M rows) and see > something similar. > > For example: > > 10 sec: 135327 operations; 13505.69 current ops/sec; [INSERT > AverageLatency(ms)=0.84] > 20 sec: 204440 operations; 6883.08 current ops/sec; [INSERT > AverageLatency(ms)=3.51] > 30 sec: 250166 operations; 4572.14 current ops/sec; [INSERT > AverageLatency(ms)=0.25] > 40 sec: 348603 operations; 9843.7 current ops/sec; [INSERT > AverageLatency(ms)=3.02] > * 50 sec: 348603 operations; 0 current ops/sec; * > 60 sec: 447040 operations; 9821.11 current ops/sec; [INSERT > AverageLatency(ms)=3.59] > 70 sec: 504461 operations; 5741.53 current ops/sec; [INSERT > AverageLatency(ms)=1.88] > 80 sec: 598797 operations; 9433.6 current ops/sec; [INSERT > AverageLatency(ms)=2.53] > 90 sec: 676724 operations; 7791.92 current ops/sec; [INSERT > AverageLatency(ms)=1.49] > 100 sec: 758754 operations; 8202.18 current ops/sec; [INSERT > AverageLatency(ms)=2.71] > 110 sec: 779263 operations; 2050.9 current ops/sec; [INSERT > AverageLatency(ms)=2.26] > 120 sec: 890004 operations; 11060.83 current ops/sec; [INSERT > AverageLatency(ms)=2.48] > 130 sec: 996641 operations; 10662.63 current ops/sec; [INSERT > AverageLatency(ms)=1.56] > * 140 sec: 996641 operations; 0 current ops/sec; * > 150 sec: 1111600 operations; 11494.75 current ops/sec; [INSERT > AverageLatency(ms)=3.81] > 160 sec: 1175494 operations; 6371.56 current ops/sec; [INSERT > AverageLatency(ms)=3.08] > 170 sec: 1209919 operations; 3442.16 current ops/sec; [INSERT > AverageLatency(ms)=0.04] > * 180 sec: 1209919 operations; 0 current ops/sec; * > 190 sec: 1265127 operations; 5515.84 current ops/sec; [INSERT > AverageLatency(ms)=10.25] > 200 sec: 1382184 operations; 11705.7 current ops/sec; [INSERT > AverageLatency(ms)=0.43] > 210 sec: 1431400 operations; 4921.11 current ops/sec; [INSERT > AverageLatency(ms)=2.39] > * 220 sec: 1431400 operations; 0 current ops/sec; * > 230 sec: 1490315 operations; 5869.2 current ops/sec; [INSERT > AverageLatency(ms)=9.58] > 240 sec: 1513430 operations; 2311.27 current ops/sec; [INSERT > AverageLatency(ms)=0.29] > 250 sec: 1599562 operations; 8613.2 current ops/sec; [INSERT > AverageLatency(ms)=3.86] > 260 sec: 1725739 operations; 12616.44 current ops/sec; [INSERT > AverageLatency(ms)=1.35] > 270 sec: 1823203 operations; 9746.4 current ops/sec; [INSERT > AverageLatency(ms)=1.39] > 280 sec: 1911486 operations; 8827.42 current ops/sec; [INSERT > AverageLatency(ms)=1.25] > 290 sec: 2000000 operations; 8850.51 current ops/sec; [INSERT > AverageLatency(ms)=1.41] > > > This is with 0.89 running on 5 region servers on a non-hdfs append-enabled > substrate. > > I am logging GC's on both master and regions and I don't see a correlation > between either. I am still verifying that, so it could be wrong, but I am > pretty sure not. What I do see is lots of small compactions that look > roughly like this: > > {Heap before GC invocations=181 (full 23): > par new generation total 19136K, used 17895K [0x00007f9aa40a0000, > 0x00007f9aa5560000, 0x00007f9aabda0000) > eden space 17024K, 99% used [0x00007f9aa40a0000, 0x00007f9aa513b058, > 0x00007f9aa5140000) > from space 2112K, 42% used [0x00007f9aa5350000, 0x00007f9aa542ef38, > 0x00007f9aa5560000) > to space 2112K, 0% used [0x00007f9aa5140000, 0x00007f9aa5140000, > 0x00007f9aa5350000) > concurrent mark-sweep generation total 252700K, used 83769K > [0x00007f9aabda0000, 0x00007f9abb467000, 0x00007f9ae28a0000) > concurrent-mark-sweep perm gen total 29120K, used 17377K > [0x00007f9ae28a0000, 0x00007f9ae4510000, 0x00007f9ae7ca0000) > 542.472: [GC 542.472: [ParNew > Desired survivor size 1081344 bytes, new threshold 4 (max 4) > - age 1: 274408 bytes, 274408 total
-
Re: Cluster Wide PausesTed Dunning 2011-01-12, 16:24
I need to collect more numbers to verify that assertion, but, yes, it would
seem to do that. On Wed, Jan 12, 2011 at 8:20 AM, Stack <[EMAIL PROTECTED]> wrote: > Its interesting that you are running into this issue Ted on a > 'non-hdfs append-enabled substrate'. This would seem to rule out > 'slow hdfs' as culprit. > > St.Ack > > On Tue, Jan 11, 2011 at 2:51 PM, Ted Dunning <[EMAIL PROTECTED]> > wrote: > > I am running YCSB to load a modest amount of data (2 M rows) and see > > something similar. > > > > For example: > > > > 10 sec: 135327 operations; 13505.69 current ops/sec; [INSERT > > AverageLatency(ms)=0.84] > > 20 sec: 204440 operations; 6883.08 current ops/sec; [INSERT > > AverageLatency(ms)=3.51] > > 30 sec: 250166 operations; 4572.14 current ops/sec; [INSERT > > AverageLatency(ms)=0.25] > > 40 sec: 348603 operations; 9843.7 current ops/sec; [INSERT > > AverageLatency(ms)=3.02] > > * 50 sec: 348603 operations; 0 current ops/sec; * > > 60 sec: 447040 operations; 9821.11 current ops/sec; [INSERT > > AverageLatency(ms)=3.59] > > 70 sec: 504461 operations; 5741.53 current ops/sec; [INSERT > > AverageLatency(ms)=1.88] > > 80 sec: 598797 operations; 9433.6 current ops/sec; [INSERT > > AverageLatency(ms)=2.53] > > 90 sec: 676724 operations; 7791.92 current ops/sec; [INSERT > > AverageLatency(ms)=1.49] > > 100 sec: 758754 operations; 8202.18 current ops/sec; [INSERT > > AverageLatency(ms)=2.71] > > 110 sec: 779263 operations; 2050.9 current ops/sec; [INSERT > > AverageLatency(ms)=2.26] > > 120 sec: 890004 operations; 11060.83 current ops/sec; [INSERT > > AverageLatency(ms)=2.48] > > 130 sec: 996641 operations; 10662.63 current ops/sec; [INSERT > > AverageLatency(ms)=1.56] > > * 140 sec: 996641 operations; 0 current ops/sec; * > > 150 sec: 1111600 operations; 11494.75 current ops/sec; [INSERT > > AverageLatency(ms)=3.81] > > 160 sec: 1175494 operations; 6371.56 current ops/sec; [INSERT > > AverageLatency(ms)=3.08] > > 170 sec: 1209919 operations; 3442.16 current ops/sec; [INSERT > > AverageLatency(ms)=0.04] > > * 180 sec: 1209919 operations; 0 current ops/sec; * > > 190 sec: 1265127 operations; 5515.84 current ops/sec; [INSERT > > AverageLatency(ms)=10.25] > > 200 sec: 1382184 operations; 11705.7 current ops/sec; [INSERT > > AverageLatency(ms)=0.43] > > 210 sec: 1431400 operations; 4921.11 current ops/sec; [INSERT > > AverageLatency(ms)=2.39] > > * 220 sec: 1431400 operations; 0 current ops/sec; * > > 230 sec: 1490315 operations; 5869.2 current ops/sec; [INSERT > > AverageLatency(ms)=9.58] > > 240 sec: 1513430 operations; 2311.27 current ops/sec; [INSERT > > AverageLatency(ms)=0.29] > > 250 sec: 1599562 operations; 8613.2 current ops/sec; [INSERT > > AverageLatency(ms)=3.86] > > 260 sec: 1725739 operations; 12616.44 current ops/sec; [INSERT > > AverageLatency(ms)=1.35] > > 270 sec: 1823203 operations; 9746.4 current ops/sec; [INSERT > > AverageLatency(ms)=1.39] > > 280 sec: 1911486 operations; 8827.42 current ops/sec; [INSERT > > AverageLatency(ms)=1.25] > > 290 sec: 2000000 operations; 8850.51 current ops/sec; [INSERT > > AverageLatency(ms)=1.41] > > > > > > This is with 0.89 running on 5 region servers on a non-hdfs > append-enabled > > substrate. > > > > I am logging GC's on both master and regions and I don't see a > correlation > > between either. I am still verifying that, so it could be wrong, but I > am > > pretty sure not. What I do see is lots of small compactions that look > > roughly like this: > > > > {Heap before GC invocations=181 (full 23): > > par new generation total 19136K, used 17895K [0x00007f9aa40a0000, > > 0x00007f9aa5560000, 0x00007f9aabda0000) > > eden space 17024K, 99% used [0x00007f9aa40a0000, 0x00007f9aa513b058, > > 0x00007f9aa5140000) > > from space 2112K, 42% used [0x00007f9aa5350000, 0x00007f9aa542ef38, > > 0x00007f9aa5560000) > > to space 2112K, 0% used [0x00007f9aa5140000, 0x00007f9aa5140000, > > 0x00007f9aa5350000) > > concurrent mark-sweep generation total 252700K, used 83769K
-
Re: Cluster Wide PausesTed Dunning 2011-01-12, 16:24
This is a plausible theory.
But I was seeing very long pauses (> 10 seconds). Is it reasonable that a split takes that long? I will move to 0.90 today and gather additional time aligned numbers and events to determine if I can see any coincidence. On Wed, Jan 12, 2011 at 8:17 AM, Stack <[EMAIL PROTECTED]> wrote: > Client loading might be 'random' > spraying over lots of random regions but they all get stuck waiting on > one particular region to come back online. >
-
Re: Cluster Wide PausesWayne 2011-01-12, 16:40
We are using 0.89.20100924, r1001068
We are seeing see it during heavy write load (which is all the time), but yesterday we had read load as well as write load and saw both reads and writes stop for 10+ seconds. The region size is the biggest clue we have found from our tests as setting up a new cluster with a 1GB max region size and starting to load heavily we will see this a lot for long long time frames. Maybe the bigger file gets hung up more easily with a split? Your description below also fits in that early on the load is not balanced so it is easier to stop everything on one node as the balance is not great early on. I will file a JIRA. I will also try to dig deeper into the logs during the pauses to find a node that might be stuck in a split. On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: > On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: > > We have very frequent cluster wide pauses that stop all reads and writes > > for seconds. > > All reads and all writes? > > I've seen the pause too for writes. Its something I've always meant > to look into. Friso postulates one cause. Another that we've talked > of is a region taking a while to come back on line after a split or a > rebalance for whatever reason. Client loading might be 'random' > spraying over lots of random regions but they all get stuck waiting on > one particular region to come back online. > > I suppose reads could be blocked for same reason if all are trying to > read from the offlined region. > > What version of hbase are you using? Splits should be faster in 0.90 > now that the split daughters come up on the same region. > > Sorry I don't have a better answer for you. Need to dig in. > > File a JIRA. If you want to help out some, stick some data up in it. > Some suggestions would be to enable logging of when we lookup region > locations in client and then note when requests go to zero. Can you > figure what region the clients are waiting on (if they are waiting on > any). If you can pull out a particular one, try and elicit its > history at time of blockage. Is it being moved or mid-split? I > suppose it makes sense that bigger regions would make the situation > 'worse'. I can take a look at it too. > > St.Ack > > > > > We are constantly loading data to this cluster of 10 nodes. > > These pauses can happen as frequently as every minute but sometimes are > not > > seen for 15+ minutes. Basically watching the Region server list with > request > > counts is the only evidence of what is going on. All reads and writes > > totally stop and if there is ever any activity it is on the node hosting > the > > .META. table with a request count of region count + 1. This problem seems > to > > be worse with a larger region size. We tried a 1GB region size and saw > this > > more than we saw actual activity (and stopped using a larger region size > > because of it). We went back to the default region size and it was > better, > > but we had too many regions so now we are up to 512M for a region size > and > > we are seeing it more again. > > > > Does anyone know what this is? We have dug into all of the logs to find > some > > sort of pause but are not able to find anything. Is this an wal hlog > roll? > > Is this a region split or compaction? Of course our biggest fear is a GC > > pause on the master but we do not have java logging turned on with the > > master to tell. What could possibly stop the entire cluster from working > for > > seconds at a time very frequently? > > > > Thanks in advance for any ideas of what could be causing this. > > >
-
Re: Cluster Wide PausesWayne 2011-01-12, 17:03
Added: https://issues.apache.org/jira/browse/HBASE-3438.
On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> wrote: > We are using 0.89.20100924, r1001068 > > We are seeing see it during heavy write load (which is all the time), but > yesterday we had read load as well as write load and saw both reads and > writes stop for 10+ seconds. The region size is the biggest clue we have > found from our tests as setting up a new cluster with a 1GB max region size > and starting to load heavily we will see this a lot for long long time > frames. Maybe the bigger file gets hung up more easily with a split? Your > description below also fits in that early on the load is not balanced so it > is easier to stop everything on one node as the balance is not great early > on. I will file a JIRA. I will also try to dig deeper into the logs during > the pauses to find a node that might be stuck in a split. > > > > On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: > >> On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: >> > We have very frequent cluster wide pauses that stop all reads and >> writes >> > for seconds. >> >> All reads and all writes? >> >> I've seen the pause too for writes. Its something I've always meant >> to look into. Friso postulates one cause. Another that we've talked >> of is a region taking a while to come back on line after a split or a >> rebalance for whatever reason. Client loading might be 'random' >> spraying over lots of random regions but they all get stuck waiting on >> one particular region to come back online. >> >> I suppose reads could be blocked for same reason if all are trying to >> read from the offlined region. >> >> What version of hbase are you using? Splits should be faster in 0.90 >> now that the split daughters come up on the same region. >> >> Sorry I don't have a better answer for you. Need to dig in. >> >> File a JIRA. If you want to help out some, stick some data up in it. >> Some suggestions would be to enable logging of when we lookup region >> locations in client and then note when requests go to zero. Can you >> figure what region the clients are waiting on (if they are waiting on >> any). If you can pull out a particular one, try and elicit its >> history at time of blockage. Is it being moved or mid-split? I >> suppose it makes sense that bigger regions would make the situation >> 'worse'. I can take a look at it too. >> >> St.Ack >> >> >> >> >> We are constantly loading data to this cluster of 10 nodes. >> > These pauses can happen as frequently as every minute but sometimes are >> not >> > seen for 15+ minutes. Basically watching the Region server list with >> request >> > counts is the only evidence of what is going on. All reads and writes >> > totally stop and if there is ever any activity it is on the node hosting >> the >> > .META. table with a request count of region count + 1. This problem >> seems to >> > be worse with a larger region size. We tried a 1GB region size and saw >> this >> > more than we saw actual activity (and stopped using a larger region size >> > because of it). We went back to the default region size and it was >> better, >> > but we had too many regions so now we are up to 512M for a region size >> and >> > we are seeing it more again. >> > >> > Does anyone know what this is? We have dug into all of the logs to find >> some >> > sort of pause but are not able to find anything. Is this an wal hlog >> roll? >> > Is this a region split or compaction? Of course our biggest fear is a GC >> > pause on the master but we do not have java logging turned on with the >> > master to tell. What could possibly stop the entire cluster from working >> for >> > seconds at a time very frequently? >> > >> > Thanks in advance for any ideas of what could be causing this. >> > >> > >
-
Re: Cluster Wide PausesStack 2011-01-12, 18:52
Jon asks that you describe your loading in the issue. Would you mind
doing so. Ted, stick up in the issue the workload and configs. you are running if you don't mind. I'd like to try it over here. Thanks lads, St.Ack On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> wrote: > Added: https://issues.apache.org/jira/browse/HBASE-3438. > > On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> wrote: > >> We are using 0.89.20100924, r1001068 >> >> We are seeing see it during heavy write load (which is all the time), but >> yesterday we had read load as well as write load and saw both reads and >> writes stop for 10+ seconds. The region size is the biggest clue we have >> found from our tests as setting up a new cluster with a 1GB max region size >> and starting to load heavily we will see this a lot for long long time >> frames. Maybe the bigger file gets hung up more easily with a split? Your >> description below also fits in that early on the load is not balanced so it >> is easier to stop everything on one node as the balance is not great early >> on. I will file a JIRA. I will also try to dig deeper into the logs during >> the pauses to find a node that might be stuck in a split. >> >> >> >> On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: >> >>> On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: >>> > We have very frequent cluster wide pauses that stop all reads and >>> writes >>> > for seconds. >>> >>> All reads and all writes? >>> >>> I've seen the pause too for writes. Its something I've always meant >>> to look into. Friso postulates one cause. Another that we've talked >>> of is a region taking a while to come back on line after a split or a >>> rebalance for whatever reason. Client loading might be 'random' >>> spraying over lots of random regions but they all get stuck waiting on >>> one particular region to come back online. >>> >>> I suppose reads could be blocked for same reason if all are trying to >>> read from the offlined region. >>> >>> What version of hbase are you using? Splits should be faster in 0.90 >>> now that the split daughters come up on the same region. >>> >>> Sorry I don't have a better answer for you. Need to dig in. >>> >>> File a JIRA. If you want to help out some, stick some data up in it. >>> Some suggestions would be to enable logging of when we lookup region >>> locations in client and then note when requests go to zero. Can you >>> figure what region the clients are waiting on (if they are waiting on >>> any). If you can pull out a particular one, try and elicit its >>> history at time of blockage. Is it being moved or mid-split? I >>> suppose it makes sense that bigger regions would make the situation >>> 'worse'. I can take a look at it too. >>> >>> St.Ack >>> >>> >>> >>> >>> We are constantly loading data to this cluster of 10 nodes. >>> > These pauses can happen as frequently as every minute but sometimes are >>> not >>> > seen for 15+ minutes. Basically watching the Region server list with >>> request >>> > counts is the only evidence of what is going on. All reads and writes >>> > totally stop and if there is ever any activity it is on the node hosting >>> the >>> > .META. table with a request count of region count + 1. This problem >>> seems to >>> > be worse with a larger region size. We tried a 1GB region size and saw >>> this >>> > more than we saw actual activity (and stopped using a larger region size >>> > because of it). We went back to the default region size and it was >>> better, >>> > but we had too many regions so now we are up to 512M for a region size >>> and >>> > we are seeing it more again. >>> > >>> > Does anyone know what this is? We have dug into all of the logs to find >>> some >>> > sort of pause but are not able to find anything. Is this an wal hlog >>> roll? >>> > Is this a region split or compaction? Of course our biggest fear is a GC >>> > pause on the master but we do not have java logging turned on with the
-
Re: Cluster Wide PausesWayne 2011-01-12, 19:52
We are seeing some TCP Resets on all nodes at the same time, and sometimes
quite a lot of them. We have yet to correlate the pauses to the TCP resets but I am starting to wonder if this is partly a network problem. Does Gigabit Ethernet break down on high volume nodes? Do high volume nodes use 10G or Infiniband? On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: > Jon asks that you describe your loading in the issue. Would you mind > doing so. Ted, stick up in the issue the workload and configs. you > are running if you don't mind. I'd like to try it over here. > Thanks lads, > St.Ack > > > On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> wrote: > > Added: https://issues.apache.org/jira/browse/HBASE-3438. > > > > On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > >> We are using 0.89.20100924, r1001068 > >> > >> We are seeing see it during heavy write load (which is all the time), > but > >> yesterday we had read load as well as write load and saw both reads and > >> writes stop for 10+ seconds. The region size is the biggest clue we have > >> found from our tests as setting up a new cluster with a 1GB max region > size > >> and starting to load heavily we will see this a lot for long long time > >> frames. Maybe the bigger file gets hung up more easily with a split? > Your > >> description below also fits in that early on the load is not balanced so > it > >> is easier to stop everything on one node as the balance is not great > early > >> on. I will file a JIRA. I will also try to dig deeper into the logs > during > >> the pauses to find a node that might be stuck in a split. > >> > >> > >> > >> On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: > >> > >>> On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: > >>> > We have very frequent cluster wide pauses that stop all reads and > >>> writes > >>> > for seconds. > >>> > >>> All reads and all writes? > >>> > >>> I've seen the pause too for writes. Its something I've always meant > >>> to look into. Friso postulates one cause. Another that we've talked > >>> of is a region taking a while to come back on line after a split or a > >>> rebalance for whatever reason. Client loading might be 'random' > >>> spraying over lots of random regions but they all get stuck waiting on > >>> one particular region to come back online. > >>> > >>> I suppose reads could be blocked for same reason if all are trying to > >>> read from the offlined region. > >>> > >>> What version of hbase are you using? Splits should be faster in 0.90 > >>> now that the split daughters come up on the same region. > >>> > >>> Sorry I don't have a better answer for you. Need to dig in. > >>> > >>> File a JIRA. If you want to help out some, stick some data up in it. > >>> Some suggestions would be to enable logging of when we lookup region > >>> locations in client and then note when requests go to zero. Can you > >>> figure what region the clients are waiting on (if they are waiting on > >>> any). If you can pull out a particular one, try and elicit its > >>> history at time of blockage. Is it being moved or mid-split? I > >>> suppose it makes sense that bigger regions would make the situation > >>> 'worse'. I can take a look at it too. > >>> > >>> St.Ack > >>> > >>> > >>> > >>> > >>> We are constantly loading data to this cluster of 10 nodes. > >>> > These pauses can happen as frequently as every minute but sometimes > are > >>> not > >>> > seen for 15+ minutes. Basically watching the Region server list with > >>> request > >>> > counts is the only evidence of what is going on. All reads and writes > >>> > totally stop and if there is ever any activity it is on the node > hosting > >>> the > >>> > .META. table with a request count of region count + 1. This problem > >>> seems to > >>> > be worse with a larger region size. We tried a 1GB region size and > saw > >>> this > >>> > more than we saw actual activity (and stopped using a larger region
-
Re: Cluster Wide PausesStack 2011-01-12, 19:57
On Wed, Jan 12, 2011 at 11:52 AM, Wayne <[EMAIL PROTECTED]> wrote:
> We are seeing some TCP Resets on all nodes at the same time, and sometimes > quite a lot of them. We have yet to correlate the pauses to the TCP resets That would be an interesting data point if indeed its the case. St.Ack
-
Re: Cluster Wide PausesTed Dunning 2011-01-12, 21:29
I will do that as soon as I have the 0.90 data.
On Wed, Jan 12, 2011 at 10:52 AM, Stack <[EMAIL PROTECTED]> wrote: > Jon asks that you describe your loading in the issue. Would you mind > doing so. Ted, stick up in the issue the workload and configs. you > are running if you don't mind. I'd like to try it over here. > Thanks lads, > St.Ack > > > On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> wrote: > > Added: https://issues.apache.org/jira/browse/HBASE-3438. > > > > On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > >> We are using 0.89.20100924, r1001068 > >> > >> We are seeing see it during heavy write load (which is all the time), > but > >> yesterday we had read load as well as write load and saw both reads and > >> writes stop for 10+ seconds. The region size is the biggest clue we have > >> found from our tests as setting up a new cluster with a 1GB max region > size > >> and starting to load heavily we will see this a lot for long long time > >> frames. Maybe the bigger file gets hung up more easily with a split? > Your > >> description below also fits in that early on the load is not balanced so > it > >> is easier to stop everything on one node as the balance is not great > early > >> on. I will file a JIRA. I will also try to dig deeper into the logs > during > >> the pauses to find a node that might be stuck in a split. > >> > >> > >> > >> On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: > >> > >>> On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: > >>> > We have very frequent cluster wide pauses that stop all reads and > >>> writes > >>> > for seconds. > >>> > >>> All reads and all writes? > >>> > >>> I've seen the pause too for writes. Its something I've always meant > >>> to look into. Friso postulates one cause. Another that we've talked > >>> of is a region taking a while to come back on line after a split or a > >>> rebalance for whatever reason. Client loading might be 'random' > >>> spraying over lots of random regions but they all get stuck waiting on > >>> one particular region to come back online. > >>> > >>> I suppose reads could be blocked for same reason if all are trying to > >>> read from the offlined region. > >>> > >>> What version of hbase are you using? Splits should be faster in 0.90 > >>> now that the split daughters come up on the same region. > >>> > >>> Sorry I don't have a better answer for you. Need to dig in. > >>> > >>> File a JIRA. If you want to help out some, stick some data up in it. > >>> Some suggestions would be to enable logging of when we lookup region > >>> locations in client and then note when requests go to zero. Can you > >>> figure what region the clients are waiting on (if they are waiting on > >>> any). If you can pull out a particular one, try and elicit its > >>> history at time of blockage. Is it being moved or mid-split? I > >>> suppose it makes sense that bigger regions would make the situation > >>> 'worse'. I can take a look at it too. > >>> > >>> St.Ack > >>> > >>> > >>> > >>> > >>> We are constantly loading data to this cluster of 10 nodes. > >>> > These pauses can happen as frequently as every minute but sometimes > are > >>> not > >>> > seen for 15+ minutes. Basically watching the Region server list with > >>> request > >>> > counts is the only evidence of what is going on. All reads and writes > >>> > totally stop and if there is ever any activity it is on the node > hosting > >>> the > >>> > .META. table with a request count of region count + 1. This problem > >>> seems to > >>> > be worse with a larger region size. We tried a 1GB region size and > saw > >>> this > >>> > more than we saw actual activity (and stopped using a larger region > size > >>> > because of it). We went back to the default region size and it was > >>> better, > >>> > but we had too many regions so now we are up to 512M for a region > size > >>> and > >>> > we are seeing it more again. > >>> > > >>> > Does anyone know what this is? We have dug into all of the logs to
-
Re: Cluster Wide PausesTatsuya Kawano 2011-01-13, 11:59
Hi Wayne, > We are seeing some TCP Resets on all nodes at the same time, and sometimes > quite a lot of them. Have you checked this article from Andrei and Cosmin? They had a busy firewall to cause network blackout. http://hstack.org/hbase-performance-testing/ Maybe it's not your case but just for sure. Thanks, -- Tatsuya Kawano (Mr.) Tokyo, Japan On Jan 13, 2011, at 4:52 AM, Wayne <[EMAIL PROTECTED]> wrote: > We are seeing some TCP Resets on all nodes at the same time, and sometimes > quite a lot of them. We have yet to correlate the pauses to the TCP resets > but I am starting to wonder if this is partly a network problem. Does > Gigabit Ethernet break down on high volume nodes? Do high volume nodes use > 10G or Infiniband? > > > On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: > >> Jon asks that you describe your loading in the issue. Would you mind >> doing so. Ted, stick up in the issue the workload and configs. you >> are running if you don't mind. I'd like to try it over here. >> Thanks lads, >> St.Ack >> >> >> On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> wrote: >>> Added: https://issues.apache.org/jira/browse/HBASE-3438. >>> >>> On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> wrote: >>> >>>> We are using 0.89.20100924, r1001068 >>>> >>>> We are seeing see it during heavy write load (which is all the time), >> but >>>> yesterday we had read load as well as write load and saw both reads and >>>> writes stop for 10+ seconds. The region size is the biggest clue we have >>>> found from our tests as setting up a new cluster with a 1GB max region >> size >>>> and starting to load heavily we will see this a lot for long long time >>>> frames. Maybe the bigger file gets hung up more easily with a split? >> Your >>>> description below also fits in that early on the load is not balanced so >> it >>>> is easier to stop everything on one node as the balance is not great >> early >>>> on. I will file a JIRA. I will also try to dig deeper into the logs >> during >>>> the pauses to find a node that might be stuck in a split. >>>> >>>> >>>> >>>> On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: >>>> >>>>> On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: >>>>>> We have very frequent cluster wide pauses that stop all reads and >>>>> writes >>>>>> for seconds. >>>>> >>>>> All reads and all writes? >>>>> >>>>> I've seen the pause too for writes. Its something I've always meant >>>>> to look into. Friso postulates one cause. Another that we've talked >>>>> of is a region taking a while to come back on line after a split or a >>>>> rebalance for whatever reason. Client loading might be 'random' >>>>> spraying over lots of random regions but they all get stuck waiting on >>>>> one particular region to come back online. >>>>> >>>>> I suppose reads could be blocked for same reason if all are trying to >>>>> read from the offlined region. >>>>> >>>>> What version of hbase are you using? Splits should be faster in 0.90 >>>>> now that the split daughters come up on the same region. >>>>> >>>>> Sorry I don't have a better answer for you. Need to dig in. >>>>> >>>>> File a JIRA. If you want to help out some, stick some data up in it. >>>>> Some suggestions would be to enable logging of when we lookup region >>>>> locations in client and then note when requests go to zero. Can you >>>>> figure what region the clients are waiting on (if they are waiting on >>>>> any). If you can pull out a particular one, try and elicit its >>>>> history at time of blockage. Is it being moved or mid-split? I >>>>> suppose it makes sense that bigger regions would make the situation >>>>> 'worse'. I can take a look at it too. >>>>> >>>>> St.Ack >>>>> >>>>> >>>>> >>>>> >>>>> We are constantly loading data to this cluster of 10 nodes. >>>>>> These pauses can happen as frequently as every minute but sometimes >> are >>>>> not >>>>>> seen for 15+ minutes. Basically watching the Region server list with
-
Re: Cluster Wide PausesWayne 2011-01-13, 12:49
Thank you for the lead! We will definitely look closer at the OS logs.
On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano <[EMAIL PROTECTED]>wrote: > > Hi Wayne, > > > We are seeing some TCP Resets on all nodes at the same time, and > sometimes > > quite a lot of them. > > > Have you checked this article from Andrei and Cosmin? They had a busy > firewall to cause network blackout. > > http://hstack.org/hbase-performance-testing/ > > Maybe it's not your case but just for sure. > > Thanks, > > -- > Tatsuya Kawano (Mr.) > Tokyo, Japan > > > On Jan 13, 2011, at 4:52 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > We are seeing some TCP Resets on all nodes at the same time, and > sometimes > > quite a lot of them. We have yet to correlate the pauses to the TCP > resets > > but I am starting to wonder if this is partly a network problem. Does > > Gigabit Ethernet break down on high volume nodes? Do high volume nodes > use > > 10G or Infiniband? > > > > > > On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: > > > >> Jon asks that you describe your loading in the issue. Would you mind > >> doing so. Ted, stick up in the issue the workload and configs. you > >> are running if you don't mind. I'd like to try it over here. > >> Thanks lads, > >> St.Ack > >> > >> > >> On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> wrote: > >>> Added: https://issues.apache.org/jira/browse/HBASE-3438. > >>> > >>> On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> wrote: > >>> > >>>> We are using 0.89.20100924, r1001068 > >>>> > >>>> We are seeing see it during heavy write load (which is all the time), > >> but > >>>> yesterday we had read load as well as write load and saw both reads > and > >>>> writes stop for 10+ seconds. The region size is the biggest clue we > have > >>>> found from our tests as setting up a new cluster with a 1GB max region > >> size > >>>> and starting to load heavily we will see this a lot for long long time > >>>> frames. Maybe the bigger file gets hung up more easily with a split? > >> Your > >>>> description below also fits in that early on the load is not balanced > so > >> it > >>>> is easier to stop everything on one node as the balance is not great > >> early > >>>> on. I will file a JIRA. I will also try to dig deeper into the logs > >> during > >>>> the pauses to find a node that might be stuck in a split. > >>>> > >>>> > >>>> > >>>> On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: > >>>> > >>>>> On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: > >>>>>> We have very frequent cluster wide pauses that stop all reads and > >>>>> writes > >>>>>> for seconds. > >>>>> > >>>>> All reads and all writes? > >>>>> > >>>>> I've seen the pause too for writes. Its something I've always meant > >>>>> to look into. Friso postulates one cause. Another that we've talked > >>>>> of is a region taking a while to come back on line after a split or a > >>>>> rebalance for whatever reason. Client loading might be 'random' > >>>>> spraying over lots of random regions but they all get stuck waiting > on > >>>>> one particular region to come back online. > >>>>> > >>>>> I suppose reads could be blocked for same reason if all are trying to > >>>>> read from the offlined region. > >>>>> > >>>>> What version of hbase are you using? Splits should be faster in 0.90 > >>>>> now that the split daughters come up on the same region. > >>>>> > >>>>> Sorry I don't have a better answer for you. Need to dig in. > >>>>> > >>>>> File a JIRA. If you want to help out some, stick some data up in it. > >>>>> Some suggestions would be to enable logging of when we lookup region > >>>>> locations in client and then note when requests go to zero. Can you > >>>>> figure what region the clients are waiting on (if they are waiting on > >>>>> any). If you can pull out a particular one, try and elicit its > >>>>> history at time of blockage. Is it being moved or mid-split? I > >>>>> suppose it makes sense that bigger regions would make the situation
-
Re: Cluster Wide PausesWayne 2011-01-14, 14:34
We have not found any smoking gun here. Most likely these are region splits
on a quickly growing/hot region that all clients get caught waiting for. On Thu, Jan 13, 2011 at 7:49 AM, Wayne <[EMAIL PROTECTED]> wrote: > Thank you for the lead! We will definitely look closer at the OS logs. > > > On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano <[EMAIL PROTECTED]>wrote: > >> >> Hi Wayne, >> >> > We are seeing some TCP Resets on all nodes at the same time, and >> sometimes >> > quite a lot of them. >> >> >> Have you checked this article from Andrei and Cosmin? They had a busy >> firewall to cause network blackout. >> >> http://hstack.org/hbase-performance-testing/ >> >> Maybe it's not your case but just for sure. >> >> Thanks, >> >> -- >> Tatsuya Kawano (Mr.) >> Tokyo, Japan >> >> >> On Jan 13, 2011, at 4:52 AM, Wayne <[EMAIL PROTECTED]> wrote: >> >> > We are seeing some TCP Resets on all nodes at the same time, and >> sometimes >> > quite a lot of them. We have yet to correlate the pauses to the TCP >> resets >> > but I am starting to wonder if this is partly a network problem. Does >> > Gigabit Ethernet break down on high volume nodes? Do high volume nodes >> use >> > 10G or Infiniband? >> > >> > >> > On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: >> > >> >> Jon asks that you describe your loading in the issue. Would you mind >> >> doing so. Ted, stick up in the issue the workload and configs. you >> >> are running if you don't mind. I'd like to try it over here. >> >> Thanks lads, >> >> St.Ack >> >> >> >> >> >> On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> wrote: >> >>> Added: https://issues.apache.org/jira/browse/HBASE-3438. >> >>> >> >>> On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> wrote: >> >>> >> >>>> We are using 0.89.20100924, r1001068 >> >>>> >> >>>> We are seeing see it during heavy write load (which is all the time), >> >> but >> >>>> yesterday we had read load as well as write load and saw both reads >> and >> >>>> writes stop for 10+ seconds. The region size is the biggest clue we >> have >> >>>> found from our tests as setting up a new cluster with a 1GB max >> region >> >> size >> >>>> and starting to load heavily we will see this a lot for long long >> time >> >>>> frames. Maybe the bigger file gets hung up more easily with a split? >> >> Your >> >>>> description below also fits in that early on the load is not balanced >> so >> >> it >> >>>> is easier to stop everything on one node as the balance is not great >> >> early >> >>>> on. I will file a JIRA. I will also try to dig deeper into the logs >> >> during >> >>>> the pauses to find a node that might be stuck in a split. >> >>>> >> >>>> >> >>>> >> >>>> On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: >> >>>> >> >>>>> On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: >> >>>>>> We have very frequent cluster wide pauses that stop all reads and >> >>>>> writes >> >>>>>> for seconds. >> >>>>> >> >>>>> All reads and all writes? >> >>>>> >> >>>>> I've seen the pause too for writes. Its something I've always meant >> >>>>> to look into. Friso postulates one cause. Another that we've >> talked >> >>>>> of is a region taking a while to come back on line after a split or >> a >> >>>>> rebalance for whatever reason. Client loading might be 'random' >> >>>>> spraying over lots of random regions but they all get stuck waiting >> on >> >>>>> one particular region to come back online. >> >>>>> >> >>>>> I suppose reads could be blocked for same reason if all are trying >> to >> >>>>> read from the offlined region. >> >>>>> >> >>>>> What version of hbase are you using? Splits should be faster in >> 0.90 >> >>>>> now that the split daughters come up on the same region. >> >>>>> >> >>>>> Sorry I don't have a better answer for you. Need to dig in. >> >>>>> >> >>>>> File a JIRA. If you want to help out some, stick some data up in >> it. >> >>>>> Some suggestions would be to enable logging of when we lookup region
-
Re: Cluster Wide PausesChristopher Tarnas 2011-01-14, 15:28
I have been seeing similar problems and found by raising the
hbase.hregion.memstore.block.multiplier to above 12 (default is two) and the hbase.hstore.blockingStoreFiles to 16 I managed to reduce the frequency of the pauses during loads. My nodes are pretty beefy (48 GB of ram) so I had room to experiment. >From what I understand that gave the regionservers more buffer before they had to halt the world to catch up. The pauses still happen but their impact is less now. -chris On Fri, Jan 14, 2011 at 8:34 AM, Wayne <[EMAIL PROTECTED]> wrote: > We have not found any smoking gun here. Most likely these are region splits > on a quickly growing/hot region that all clients get caught waiting for. > > > On Thu, Jan 13, 2011 at 7:49 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > Thank you for the lead! We will definitely look closer at the OS logs. > > > > > > On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano <[EMAIL PROTECTED] > >wrote: > > > >> > >> Hi Wayne, > >> > >> > We are seeing some TCP Resets on all nodes at the same time, and > >> sometimes > >> > quite a lot of them. > >> > >> > >> Have you checked this article from Andrei and Cosmin? They had a busy > >> firewall to cause network blackout. > >> > >> http://hstack.org/hbase-performance-testing/ > >> > >> Maybe it's not your case but just for sure. > >> > >> Thanks, > >> > >> -- > >> Tatsuya Kawano (Mr.) > >> Tokyo, Japan > >> > >> > >> On Jan 13, 2011, at 4:52 AM, Wayne <[EMAIL PROTECTED]> wrote: > >> > >> > We are seeing some TCP Resets on all nodes at the same time, and > >> sometimes > >> > quite a lot of them. We have yet to correlate the pauses to the TCP > >> resets > >> > but I am starting to wonder if this is partly a network problem. Does > >> > Gigabit Ethernet break down on high volume nodes? Do high volume nodes > >> use > >> > 10G or Infiniband? > >> > > >> > > >> > On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: > >> > > >> >> Jon asks that you describe your loading in the issue. Would you mind > >> >> doing so. Ted, stick up in the issue the workload and configs. you > >> >> are running if you don't mind. I'd like to try it over here. > >> >> Thanks lads, > >> >> St.Ack > >> >> > >> >> > >> >> On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> wrote: > >> >>> Added: https://issues.apache.org/jira/browse/HBASE-3438. > >> >>> > >> >>> On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> wrote: > >> >>> > >> >>>> We are using 0.89.20100924, r1001068 > >> >>>> > >> >>>> We are seeing see it during heavy write load (which is all the > time), > >> >> but > >> >>>> yesterday we had read load as well as write load and saw both reads > >> and > >> >>>> writes stop for 10+ seconds. The region size is the biggest clue we > >> have > >> >>>> found from our tests as setting up a new cluster with a 1GB max > >> region > >> >> size > >> >>>> and starting to load heavily we will see this a lot for long long > >> time > >> >>>> frames. Maybe the bigger file gets hung up more easily with a > split? > >> >> Your > >> >>>> description below also fits in that early on the load is not > balanced > >> so > >> >> it > >> >>>> is easier to stop everything on one node as the balance is not > great > >> >> early > >> >>>> on. I will file a JIRA. I will also try to dig deeper into the logs > >> >> during > >> >>>> the pauses to find a node that might be stuck in a split. > >> >>>> > >> >>>> > >> >>>> > >> >>>> On Wed, Jan 12, 2011 at 11:17 AM, Stack <[EMAIL PROTECTED]> wrote: > >> >>>> > >> >>>>> On Tue, Jan 11, 2011 at 2:34 PM, Wayne <[EMAIL PROTECTED]> wrote: > >> >>>>>> We have very frequent cluster wide pauses that stop all reads and > >> >>>>> writes > >> >>>>>> for seconds. > >> >>>>> > >> >>>>> All reads and all writes? > >> >>>>> > >> >>>>> I've seen the pause too for writes. Its something I've always > meant > >> >>>>> to look into. Friso postulates one cause. Another that we've > >> talked > >> >>>>> of is a region taking a while to come back on line after a split
-
RE: Cluster Wide PausesJonathan Gray 2011-01-14, 17:29
These are a different kind of pause (those caused by blockingStoreFiles).
This is HBase stepping in and actually blocking updates to a region because compactions have not been able to keep up with the write load. It could manifest itself in the same way but this is different than shorter pauses caused by periodic offlining of regions during balancing and splits. Wayne, have you confirmed in your RegionServer logs that the pauses are associated with splits or region movement, and that you are not seeing the blocking store files issue? JG > -----Original Message----- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christopher > Tarnas > Sent: Friday, January 14, 2011 7:29 AM > To: [EMAIL PROTECTED] > Subject: Re: Cluster Wide Pauses > > I have been seeing similar problems and found by raising the > hbase.hregion.memstore.block.multiplier > to above 12 (default is two) and the hbase.hstore.blockingStoreFiles to 16 I > managed to reduce the frequency of the pauses during loads. My nodes are > pretty beefy (48 GB of ram) so I had room to experiment. > > From what I understand that gave the regionservers more buffer before > they had to halt the world to catch up. The pauses still happen but their > impact is less now. > > -chris > > On Fri, Jan 14, 2011 at 8:34 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > We have not found any smoking gun here. Most likely these are region > > splits on a quickly growing/hot region that all clients get caught waiting for. > > > > > > On Thu, Jan 13, 2011 at 7:49 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > > > Thank you for the lead! We will definitely look closer at the OS logs. > > > > > > > > > On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano > > ><[EMAIL PROTECTED] > > >wrote: > > > > > >> > > >> Hi Wayne, > > >> > > >> > We are seeing some TCP Resets on all nodes at the same time, and > > >> sometimes > > >> > quite a lot of them. > > >> > > >> > > >> Have you checked this article from Andrei and Cosmin? They had a > > >> busy firewall to cause network blackout. > > >> > > >> http://hstack.org/hbase-performance-testing/ > > >> > > >> Maybe it's not your case but just for sure. > > >> > > >> Thanks, > > >> > > >> -- > > >> Tatsuya Kawano (Mr.) > > >> Tokyo, Japan > > >> > > >> > > >> On Jan 13, 2011, at 4:52 AM, Wayne <[EMAIL PROTECTED]> wrote: > > >> > > >> > We are seeing some TCP Resets on all nodes at the same time, and > > >> sometimes > > >> > quite a lot of them. We have yet to correlate the pauses to the > > >> > TCP > > >> resets > > >> > but I am starting to wonder if this is partly a network problem. > > >> > Does Gigabit Ethernet break down on high volume nodes? Do high > > >> > volume nodes > > >> use > > >> > 10G or Infiniband? > > >> > > > >> > > > >> > On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: > > >> > > > >> >> Jon asks that you describe your loading in the issue. Would you > > >> >> mind doing so. Ted, stick up in the issue the workload and > > >> >> configs. you are running if you don't mind. I'd like to try it over here. > > >> >> Thanks lads, > > >> >> St.Ack > > >> >> > > >> >> > > >> >> On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> > wrote: > > >> >>> Added: https://issues.apache.org/jira/browse/HBASE-3438. > > >> >>> > > >> >>> On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> > wrote: > > >> >>> > > >> >>>> We are using 0.89.20100924, r1001068 > > >> >>>> > > >> >>>> We are seeing see it during heavy write load (which is all the > > time), > > >> >> but > > >> >>>> yesterday we had read load as well as write load and saw both > > >> >>>> reads > > >> and > > >> >>>> writes stop for 10+ seconds. The region size is the biggest > > >> >>>> clue we > > >> have > > >> >>>> found from our tests as setting up a new cluster with a 1GB > > >> >>>> max > > >> region > > >> >> size > > >> >>>> and starting to load heavily we will see this a lot for long > > >> >>>> long > > >> time > > >> >>>> frames. Maybe the bigger file gets hung up more easily with a
-
Re: Cluster Wide PausesChristopher Tarnas 2011-01-14, 17:53
Thanks - I was not sure and had not received a response from the list on my
related question earlier this week. It does seem like compactions are related to my problem, and if I understand correctly does raising hbase.hregion.memstore.block.multiplier give it more of a buffer for that before writes are blocked while compactions happen? I'm writing via thrift (about 30 clients) to a 5 node cluster when I see this problem. There is no io wait so I don't think it is disk bound, and it is not CPU starved. I'm waiting on IT to get me access to ganglia for the network info. -chris On Fri, Jan 14, 2011 at 11:29 AM, Jonathan Gray <[EMAIL PROTECTED]> wrote: > These are a different kind of pause (those caused by blockingStoreFiles). > > This is HBase stepping in and actually blocking updates to a region because > compactions have not been able to keep up with the write load. It could > manifest itself in the same way but this is different than shorter pauses > caused by periodic offlining of regions during balancing and splits. > > Wayne, have you confirmed in your RegionServer logs that the pauses are > associated with splits or region movement, and that you are not seeing the > blocking store files issue? > > JG > > > -----Original Message----- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christopher > > Tarnas > > Sent: Friday, January 14, 2011 7:29 AM > > To: [EMAIL PROTECTED] > > Subject: Re: Cluster Wide Pauses > > > > I have been seeing similar problems and found by raising the > > hbase.hregion.memstore.block.multiplier > > to above 12 (default is two) and the hbase.hstore.blockingStoreFiles to > 16 I > > managed to reduce the frequency of the pauses during loads. My nodes are > > pretty beefy (48 GB of ram) so I had room to experiment. > > > > From what I understand that gave the regionservers more buffer before > > they had to halt the world to catch up. The pauses still happen but their > > impact is less now. > > > > -chris > > > > On Fri, Jan 14, 2011 at 8:34 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > > > We have not found any smoking gun here. Most likely these are region > > > splits on a quickly growing/hot region that all clients get caught > waiting for. > > > > > > > > > On Thu, Jan 13, 2011 at 7:49 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > > > > > Thank you for the lead! We will definitely look closer at the OS > logs. > > > > > > > > > > > > On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano > > > ><[EMAIL PROTECTED] > > > >wrote: > > > > > > > >> > > > >> Hi Wayne, > > > >> > > > >> > We are seeing some TCP Resets on all nodes at the same time, and > > > >> sometimes > > > >> > quite a lot of them. > > > >> > > > >> > > > >> Have you checked this article from Andrei and Cosmin? They had a > > > >> busy firewall to cause network blackout. > > > >> > > > >> http://hstack.org/hbase-performance-testing/ > > > >> > > > >> Maybe it's not your case but just for sure. > > > >> > > > >> Thanks, > > > >> > > > >> -- > > > >> Tatsuya Kawano (Mr.) > > > >> Tokyo, Japan > > > >> > > > >> > > > >> On Jan 13, 2011, at 4:52 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > >> > > > >> > We are seeing some TCP Resets on all nodes at the same time, and > > > >> sometimes > > > >> > quite a lot of them. We have yet to correlate the pauses to the > > > >> > TCP > > > >> resets > > > >> > but I am starting to wonder if this is partly a network problem. > > > >> > Does Gigabit Ethernet break down on high volume nodes? Do high > > > >> > volume nodes > > > >> use > > > >> > 10G or Infiniband? > > > >> > > > > >> > > > > >> > On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: > > > >> > > > > >> >> Jon asks that you describe your loading in the issue. Would you > > > >> >> mind doing so. Ted, stick up in the issue the workload and > > > >> >> configs. you are running if you don't mind. I'd like to try it > over here. > > > >> >> Thanks lads, > > > >> >> St.Ack > > > >> >> > > > >> >> > > > >> >> On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]>
-
Re: Cluster Wide PausesWayne 2011-01-14, 18:02
I have not confirmed anything for sure, other than it does not cross tables
(does not happen when 2 tables are being written to). The region server logs are filled with compactions, splits, and memstore flushes so during a pause everything looks like any other time to me. On Fri, Jan 14, 2011 at 12:29 PM, Jonathan Gray <[EMAIL PROTECTED]> wrote: > These are a different kind of pause (those caused by blockingStoreFiles). > > This is HBase stepping in and actually blocking updates to a region because > compactions have not been able to keep up with the write load. It could > manifest itself in the same way but this is different than shorter pauses > caused by periodic offlining of regions during balancing and splits. > > Wayne, have you confirmed in your RegionServer logs that the pauses are > associated with splits or region movement, and that you are not seeing the > blocking store files issue? > > JG > > > -----Original Message----- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christopher > > Tarnas > > Sent: Friday, January 14, 2011 7:29 AM > > To: [EMAIL PROTECTED] > > Subject: Re: Cluster Wide Pauses > > > > I have been seeing similar problems and found by raising the > > hbase.hregion.memstore.block.multiplier > > to above 12 (default is two) and the hbase.hstore.blockingStoreFiles to > 16 I > > managed to reduce the frequency of the pauses during loads. My nodes are > > pretty beefy (48 GB of ram) so I had room to experiment. > > > > From what I understand that gave the regionservers more buffer before > > they had to halt the world to catch up. The pauses still happen but their > > impact is less now. > > > > -chris > > > > On Fri, Jan 14, 2011 at 8:34 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > > > We have not found any smoking gun here. Most likely these are region > > > splits on a quickly growing/hot region that all clients get caught > waiting for. > > > > > > > > > On Thu, Jan 13, 2011 at 7:49 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > > > > > Thank you for the lead! We will definitely look closer at the OS > logs. > > > > > > > > > > > > On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano > > > ><[EMAIL PROTECTED] > > > >wrote: > > > > > > > >> > > > >> Hi Wayne, > > > >> > > > >> > We are seeing some TCP Resets on all nodes at the same time, and > > > >> sometimes > > > >> > quite a lot of them. > > > >> > > > >> > > > >> Have you checked this article from Andrei and Cosmin? They had a > > > >> busy firewall to cause network blackout. > > > >> > > > >> http://hstack.org/hbase-performance-testing/ > > > >> > > > >> Maybe it's not your case but just for sure. > > > >> > > > >> Thanks, > > > >> > > > >> -- > > > >> Tatsuya Kawano (Mr.) > > > >> Tokyo, Japan > > > >> > > > >> > > > >> On Jan 13, 2011, at 4:52 AM, Wayne <[EMAIL PROTECTED]> wrote: > > > >> > > > >> > We are seeing some TCP Resets on all nodes at the same time, and > > > >> sometimes > > > >> > quite a lot of them. We have yet to correlate the pauses to the > > > >> > TCP > > > >> resets > > > >> > but I am starting to wonder if this is partly a network problem. > > > >> > Does Gigabit Ethernet break down on high volume nodes? Do high > > > >> > volume nodes > > > >> use > > > >> > 10G or Infiniband? > > > >> > > > > >> > > > > >> > On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: > > > >> > > > > >> >> Jon asks that you describe your loading in the issue. Would you > > > >> >> mind doing so. Ted, stick up in the issue the workload and > > > >> >> configs. you are running if you don't mind. I'd like to try it > over here. > > > >> >> Thanks lads, > > > >> >> St.Ack > > > >> >> > > > >> >> > > > >> >> On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> > > wrote: > > > >> >>> Added: https://issues.apache.org/jira/browse/HBASE-3438. > > > >> >>> > > > >> >>> On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> > > wrote: > > > >> >>> > > > >> >>>> We are using 0.89.20100924, r1001068 > > > >> >>>> > > > >> >
-
Re: Cluster Wide PausesStack 2011-01-14, 18:05
Shorten the retry pause in your clients too as was previously suggested. Does that help?
On Jan 14, 2011, at 9:29, Jonathan Gray <[EMAIL PROTECTED]> wrote: > These are a different kind of pause (those caused by blockingStoreFiles). > > This is HBase stepping in and actually blocking updates to a region because compactions have not been able to keep up with the write load. It could manifest itself in the same way but this is different than shorter pauses caused by periodic offlining of regions during balancing and splits. > > Wayne, have you confirmed in your RegionServer logs that the pauses are associated with splits or region movement, and that you are not seeing the blocking store files issue? > > JG > >> -----Original Message----- >> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christopher >> Tarnas >> Sent: Friday, January 14, 2011 7:29 AM >> To: [EMAIL PROTECTED] >> Subject: Re: Cluster Wide Pauses >> >> I have been seeing similar problems and found by raising the >> hbase.hregion.memstore.block.multiplier >> to above 12 (default is two) and the hbase.hstore.blockingStoreFiles to 16 I >> managed to reduce the frequency of the pauses during loads. My nodes are >> pretty beefy (48 GB of ram) so I had room to experiment. >> >> From what I understand that gave the regionservers more buffer before >> they had to halt the world to catch up. The pauses still happen but their >> impact is less now. >> >> -chris >> >> On Fri, Jan 14, 2011 at 8:34 AM, Wayne <[EMAIL PROTECTED]> wrote: >> >>> We have not found any smoking gun here. Most likely these are region >>> splits on a quickly growing/hot region that all clients get caught waiting for. >>> >>> >>> On Thu, Jan 13, 2011 at 7:49 AM, Wayne <[EMAIL PROTECTED]> wrote: >>> >>>> Thank you for the lead! We will definitely look closer at the OS logs. >>>> >>>> >>>> On Thu, Jan 13, 2011 at 6:59 AM, Tatsuya Kawano >>>> <[EMAIL PROTECTED] >>>> wrote: >>>> >>>>> >>>>> Hi Wayne, >>>>> >>>>>> We are seeing some TCP Resets on all nodes at the same time, and >>>>> sometimes >>>>>> quite a lot of them. >>>>> >>>>> >>>>> Have you checked this article from Andrei and Cosmin? They had a >>>>> busy firewall to cause network blackout. >>>>> >>>>> http://hstack.org/hbase-performance-testing/ >>>>> >>>>> Maybe it's not your case but just for sure. >>>>> >>>>> Thanks, >>>>> >>>>> -- >>>>> Tatsuya Kawano (Mr.) >>>>> Tokyo, Japan >>>>> >>>>> >>>>> On Jan 13, 2011, at 4:52 AM, Wayne <[EMAIL PROTECTED]> wrote: >>>>> >>>>>> We are seeing some TCP Resets on all nodes at the same time, and >>>>> sometimes >>>>>> quite a lot of them. We have yet to correlate the pauses to the >>>>>> TCP >>>>> resets >>>>>> but I am starting to wonder if this is partly a network problem. >>>>>> Does Gigabit Ethernet break down on high volume nodes? Do high >>>>>> volume nodes >>>>> use >>>>>> 10G or Infiniband? >>>>>> >>>>>> >>>>>> On Wed, Jan 12, 2011 at 1:52 PM, Stack <[EMAIL PROTECTED]> wrote: >>>>>> >>>>>>> Jon asks that you describe your loading in the issue. Would you >>>>>>> mind doing so. Ted, stick up in the issue the workload and >>>>>>> configs. you are running if you don't mind. I'd like to try it over here. >>>>>>> Thanks lads, >>>>>>> St.Ack >>>>>>> >>>>>>> >>>>>>> On Wed, Jan 12, 2011 at 9:03 AM, Wayne <[EMAIL PROTECTED]> >> wrote: >>>>>>>> Added: https://issues.apache.org/jira/browse/HBASE-3438. >>>>>>>> >>>>>>>> On Wed, Jan 12, 2011 at 11:40 AM, Wayne <[EMAIL PROTECTED]> >> wrote: >>>>>>>> >>>>>>>>> We are using 0.89.20100924, r1001068 >>>>>>>>> >>>>>>>>> We are seeing see it during heavy write load (which is all the >>> time), >>>>>>> but >>>>>>>>> yesterday we had read load as well as write load and saw both >>>>>>>>> reads >>>>> and >>>>>>>>> writes stop for 10+ seconds. The region size is the biggest >>>>>>>>> clue we >>>>> have >>>>>>>>> found from our tests as setting up a new cluster with a 1GB >>>>>>>>> max >>>>> region >>>>>>> size >>>>>>>>> and starting to load heavily we will see this a lot for long
-
RE: Cluster Wide PausesGeoff Hendrey 2011-01-14, 18:06
This is not an answer to your question, but just an anecdote on cluster
pauses/slowdowns. We had horrible problems with cluster wide pauses. I think there were several keys to getting this resolved: 1) we used the default settings recommended for bulk inserts: http://people.apache.org/~jdcryans/HUG8/HUG8-rawson.pdf 2) we upgraded to hbase 20.6 b/c there was a deadlock bug in prior versions that basically just caused the entire cluster to "go to sleep" Finally, we had a very strange problem which took 3 weeks of debugging to get to the bottom of. I don't expect that this is your problem, but I'll just throw it out there. Most bulk HBase data-producing M/R jobs are going to do some processing, then write the data from the reducer into hbase (using autoflush=false and disabling the WAL). Since the reducers all receive keys in the same order, this causes all the reducers to load the same HBase region simultaneously. We had this "great idea" that if we reversed the keys that we wrote out of our mapper, then un-reversed them in the reducer, that our reducers would be randomly writing to different region servers, not hitting a single region in lock step. Now, I have some theories on why this seemingly innocuous approach repeatedly destroyed our entire Hbase database. I won't wax philosophical here, but one thing is certain: Any table created via batch inserts of randomized keys got totally hosed. Scans became dirt slow and compactions ran constantly, even *days* after the table was created. None of these problems made a whole lot of sense, which is why it took 3-4 weeks of debugging for us to back this "key randomizing" out of our code. The hosed tables, actually had to be dropped for the problem, and ensuing chaos to totally abate. Until we dropped the tables, if the region server logs showed constant compaction. Like I said, it sounds crazy, but this definitely was the cause of our problem. I'm fully expecting a lot of "your crazy" responses to this email, but we repeatedly reproduced the issue, and the fix was to stop the "key reversing". We just had to live with all the reducers loading individual regions in lock step, as this was really not a big deal (at least not as big a deal as hosing the entire installation). -g -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christopher Tarnas Sent: Friday, January 14, 2011 9:54 AM To: [EMAIL PROTECTED] Subject: Re: Cluster Wide Pauses Thanks - I was not sure and had not received a response from the list on my related question earlier this week. It does seem like compactions are related to my problem, and if I understand correctly does raising hbase.hregion.memstore.block.multiplier give it more of a buffer for that before writes are blocked while compactions happen? I'm writing via thrift (about 30 clients) to a 5 node cluster when I see this problem. There is no io wait so I don't think it is disk bound, and it is not CPU starved. I'm waiting on IT to get me access to ganglia for the network info. -chris On Fri, Jan 14, 2011 at 11:29 AM, Jonathan Gray <[EMAIL PROTECTED]> wrote: > These are a different kind of pause (those caused by blockingStoreFiles). > > This is HBase stepping in and actually blocking updates to a region because > compactions have not been able to keep up with the write load. It could > manifest itself in the same way but this is different than shorter pauses > caused by periodic offlining of regions during balancing and splits. > > Wayne, have you confirmed in your RegionServer logs that the pauses are > associated with splits or region movement, and that you are not seeing the > blocking store files issue? > > JG > > > -----Original Message----- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christopher > > Tarnas > > Sent: Friday, January 14, 2011 7:29 AM > > To: [EMAIL PROTECTED] > > Subject: Re: Cluster Wide Pauses > > > > I have been seeing similar problems and found by raising the > > hbase.hregion.memstore.block.multiplier to nodes are before their region and a and the problem. high wrote: you it the both biggest 1GB long with a not not the a stuck are faster in. data zero. are elicit nodes. but server reads the This region larger and a the an biggest turned cluster
-
Re: Cluster Wide PausesTodd Lipcon 2011-01-27, 03:51
Hey all,
I spent some time this afternoon looking into this issue and think I found a good culprit: https://issues.apache.org/jira/browse/HBASE-3483 If you've been having this problem, please watch that JIRA for a patch to try (the one up there now is OK but not great). -Todd On Fri, Jan 14, 2011 at 10:06 AM, Geoff Hendrey <[EMAIL PROTECTED]>wrote: > This is not an answer to your question, but just an anecdote on cluster > pauses/slowdowns. We had horrible problems with cluster wide pauses. I > think there were several keys to getting this resolved: > > 1) we used the default settings recommended for bulk inserts: > http://people.apache.org/~jdcryans/HUG8/HUG8-rawson.pdf > 2) we upgraded to hbase 20.6 b/c there was a deadlock bug in prior > versions that basically just caused the entire cluster to "go to sleep" > > Finally, we had a very strange problem which took 3 weeks of debugging > to get to the bottom of. I don't expect that this is your problem, but > I'll just throw it out there. Most bulk HBase data-producing M/R jobs > are going to do some processing, then write the data from the reducer > into hbase (using autoflush=false and disabling the WAL). Since the > reducers all receive keys in the same order, this causes all the > reducers to load the same HBase region simultaneously. We had this > "great idea" that if we reversed the keys that we wrote out of our > mapper, then un-reversed them in the reducer, that our reducers would be > randomly writing to different region servers, not hitting a single > region in lock step. Now, I have some theories on why this seemingly > innocuous approach repeatedly destroyed our entire Hbase database. I > won't wax philosophical here, but one thing is certain: Any table > created via batch inserts of randomized keys got totally hosed. Scans > became dirt slow and compactions ran constantly, even *days* after the > table was created. None of these problems made a whole lot of sense, > which is why it took 3-4 weeks of debugging for us to back this "key > randomizing" out of our code. The hosed tables, actually had to be > dropped for the problem, and ensuing chaos to totally abate. Until we > dropped the tables, if the region server logs showed constant > compaction. Like I said, it sounds crazy, but this definitely was the > cause of our problem. I'm fully expecting a lot of "your crazy" > responses to this email, but we repeatedly reproduced the issue, and the > fix was to stop the "key reversing". We just had to live with all the > reducers loading individual regions in lock step, as this was really not > a big deal (at least not as big a deal as hosing the entire > installation). > > -g > > -----Original Message----- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christopher > Tarnas > Sent: Friday, January 14, 2011 9:54 AM > To: [EMAIL PROTECTED] > Subject: Re: Cluster Wide Pauses > > Thanks - I was not sure and had not received a response from the list on > my > related question earlier this week. > > It does seem like compactions are related to my problem, and if I > understand > correctly does raising hbase.hregion.memstore.block.multiplier give it > more > of a buffer for that before writes are blocked while compactions happen? > I'm > writing via thrift (about 30 clients) to a 5 node cluster when I see > this > problem. There is no io wait so I don't think it is disk bound, and it > is > not CPU starved. I'm waiting on IT to get me access to ganglia for the > network info. > > -chris > > On Fri, Jan 14, 2011 at 11:29 AM, Jonathan Gray <[EMAIL PROTECTED]> wrote: > > > These are a different kind of pause (those caused by > blockingStoreFiles). > > > > This is HBase stepping in and actually blocking updates to a region > because > > compactions have not been able to keep up with the write load. It > could > > manifest itself in the same way but this is different than shorter > pauses > > caused by periodic offlining of regions during balancing and splits. Todd Lipcon Software Engineer, Cloudera
-
Re: Cluster Wide PausesWayne 2011-01-27, 14:54
This is great, thanks. We had narrowed down our problem to a memstore flush,
and this confirms why. On Wed, Jan 26, 2011 at 10:51 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote: > Hey all, > > I spent some time this afternoon looking into this issue and think I found > a > good culprit: > https://issues.apache.org/jira/browse/HBASE-3483 > > If you've been having this problem, please watch that JIRA for a patch to > try (the one up there now is OK but not great). > > -Todd > > On Fri, Jan 14, 2011 at 10:06 AM, Geoff Hendrey <[EMAIL PROTECTED] > >wrote: > > > This is not an answer to your question, but just an anecdote on cluster > > pauses/slowdowns. We had horrible problems with cluster wide pauses. I > > think there were several keys to getting this resolved: > > > > 1) we used the default settings recommended for bulk inserts: > > http://people.apache.org/~jdcryans/HUG8/HUG8-rawson.pdf<http://people.apache.org/%7Ejdcryans/HUG8/HUG8-rawson.pdf> > > 2) we upgraded to hbase 20.6 b/c there was a deadlock bug in prior > > versions that basically just caused the entire cluster to "go to sleep" > > > > Finally, we had a very strange problem which took 3 weeks of debugging > > to get to the bottom of. I don't expect that this is your problem, but > > I'll just throw it out there. Most bulk HBase data-producing M/R jobs > > are going to do some processing, then write the data from the reducer > > into hbase (using autoflush=false and disabling the WAL). Since the > > reducers all receive keys in the same order, this causes all the > > reducers to load the same HBase region simultaneously. We had this > > "great idea" that if we reversed the keys that we wrote out of our > > mapper, then un-reversed them in the reducer, that our reducers would be > > randomly writing to different region servers, not hitting a single > > region in lock step. Now, I have some theories on why this seemingly > > innocuous approach repeatedly destroyed our entire Hbase database. I > > won't wax philosophical here, but one thing is certain: Any table > > created via batch inserts of randomized keys got totally hosed. Scans > > became dirt slow and compactions ran constantly, even *days* after the > > table was created. None of these problems made a whole lot of sense, > > which is why it took 3-4 weeks of debugging for us to back this "key > > randomizing" out of our code. The hosed tables, actually had to be > > dropped for the problem, and ensuing chaos to totally abate. Until we > > dropped the tables, if the region server logs showed constant > > compaction. Like I said, it sounds crazy, but this definitely was the > > cause of our problem. I'm fully expecting a lot of "your crazy" > > responses to this email, but we repeatedly reproduced the issue, and the > > fix was to stop the "key reversing". We just had to live with all the > > reducers loading individual regions in lock step, as this was really not > > a big deal (at least not as big a deal as hosing the entire > > installation). > > > > -g > > > > -----Original Message----- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Christopher > > Tarnas > > Sent: Friday, January 14, 2011 9:54 AM > > To: [EMAIL PROTECTED] > > Subject: Re: Cluster Wide Pauses > > > > Thanks - I was not sure and had not received a response from the list on > > my > > related question earlier this week. > > > > It does seem like compactions are related to my problem, and if I > > understand > > correctly does raising hbase.hregion.memstore.block.multiplier give it > > more > > of a buffer for that before writes are blocked while compactions happen? > > I'm > > writing via thrift (about 30 clients) to a 5 node cluster when I see > > this > > problem. There is no io wait so I don't think it is disk bound, and it > > is > > not CPU starved. I'm waiting on IT to get me access to ganglia for the > > network info. > > > > -chris > > > > On Fri, Jan 14, 2011 at 11:29 AM, Jonathan Gray <[EMAIL PROTECTED]> wrote: > > > > |