-
.META. region server DDOSed by too many clients
Varun Sharma 2012-12-05, 23:51
Hi,

I am running HBase 0.94.0 and I have a significant write load being put on a table with 98 regions on a 15 node cluster. This write load also comes from a very large number of clients (~1000). I am running with 10 priority IPC handlers and 200 IPC handlers. It seems the region server holding .META. is DDOSed. All 200 handlers are busy serving the .META. region and they are all locked onto one object. Here is the jstack for the region server:

"IPC Server handler 182 on 60020" daemon prio=10 tid=0x00007f329872c800 nid=0x4401 waiting on condition [0x00007f328807f000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000542d72e30> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:871)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1201)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
    at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
    at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
    at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:71)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:290)
    at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
    at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
    at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:299)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:244)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
    - locked <0x000000063b4965d0> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
    - locked <0x000000063b4965d0> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3354)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3310)
    - locked <0x0000000523c211e0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3327)
    - locked <0x0000000523c211e0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4066)
    at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4039)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1941)

The client side trace shows that we are looking for the META region:

"thrift-worker-3499" daemon prio=10 tid=0x00007f789dd98800 nid=0xb52 waiting for monitor entry [0x00007f778672d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:943)
    - waiting to lock <0x0000000707978298> (a java.lang.Object)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1482)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
    at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:729)
    - locked <0x000000070821d5a0> (a org.apache.hadoop.hbase.client.HTable)
    at org.apache.hadoop.hbase.client.HTable.get(HTable.java:698)
    at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.get(HTablePool.java:371)

On the RS page, I see 68 million read requests for the META region, while for the other 98 regions we have done about 20 million write requests in total. Regions have not moved around at all and no crashes have happened. Why do we have such an incredible number of scans over META, and is there something I can do about this issue?

Varun
-
Re: .META. region server DDOSed by too many clients
Varun Sharma 2012-12-06, 00:05
I am looking at a load of 4K QPS with 500 row updates in each RPC (so almost 60K row updates). I am using the batch interface, so these should get grouped so that we only do 15 RPC(s) for each region server. This is when META is most needed. AFAIK, it's cached at the client side.

Also, is .META. specifically kept in memory? It seems the region server is stuck doing a seek. Is this a sign of .META. not being cached? (PS: I have one in-memory column family; the block cache has 1G free in it.)
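For readers unfamiliar with the client-side caching mentioned in this message, here is a minimal sketch of the general idea: region locations are kept in a sorted map keyed by region start key, and a row is resolved by finding the greatest start key at or below it. This is an illustration only, not the HBase client code. The names `RegionLocationCacheSketch`, `Location`, and `lookup` are hypothetical, and row keys are simplified to strings. A miss is the case in which a real client would have to ask .META.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical, simplified client-side region location cache (not HBase code).
final class RegionLocationCacheSketch {

    // One cached region: [startKey, endKey) served by `server`.
    // An empty endKey stands for "last region of the table".
    static final class Location {
        final String startKey;
        final String endKey;
        final String server;

        Location(String startKey, String endKey, String server) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.server = server;
        }
    }

    private final ConcurrentSkipListMap<String, Location> byStartKey =
            new ConcurrentSkipListMap<>();

    void put(Location loc) {
        byStartKey.put(loc.startKey, loc);
    }

    // Returns the cached location covering `row`, or null on a cache miss.
    Location lookup(String row) {
        Map.Entry<String, Location> entry = byStartKey.floorEntry(row);
        if (entry == null) {
            return null; // no cached region starts at or before this row
        }
        Location loc = entry.getValue();
        boolean covers = loc.endKey.isEmpty() || row.compareTo(loc.endKey) < 0;
        return covers ? loc : null;
    }

    public static void main(String[] args) {
        RegionLocationCacheSketch cache = new RegionLocationCacheSketch();
        cache.put(new Location("", "m", "rs1"));
        cache.put(new Location("m", "", "rs2"));
        System.out.println(cache.lookup("apple").server);
        System.out.println(cache.lookup("zebra").server);
    }
}
```

With a warm cache like this, steady-state writes should need no META reads at all, which is why 68 million META reads with no region movement looks surprising.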
-
Re: .META. region server DDOSed by too many clients
lars hofhansl 2012-12-06, 00:20
Looks like you're running into HBASE-5898.
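For context: HBASE-5898 is, as far as I recall, the JIRA that proposes double-checked locking around the block cache lock, which matches the `IdLock.getLockEntry` frame under `HFileReaderV2.readBlock` in the trace at the top of this thread. The sketch below shows the general pattern only and is not the HBase patch. The class name, the `loader` parameter, and the counters are assumptions made for illustration, and the per-offset lock map is never cleaned up here, which a real implementation would need to handle.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongFunction;

// Hypothetical sketch of a double-checked block cache lookup (not HBase code).
final class DoubleCheckedCacheSketch {
    private final ConcurrentHashMap<Long, byte[]> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Counters exist only to make the behaviour observable in this sketch.
    final AtomicInteger loads = new AtomicInteger();
    final AtomicInteger lockAcquisitions = new AtomicInteger();

    byte[] readBlock(long offset, LongFunction<byte[]> loader) {
        // First check, without any lock: the common cache-hit path
        // never touches the per-offset lock.
        byte[] block = cache.get(offset);
        if (block != null) {
            return block;
        }

        // Miss: serialize loaders of the same offset so the block is read once.
        ReentrantLock lock = locks.computeIfAbsent(offset, k -> new ReentrantLock());
        lock.lock();
        lockAcquisitions.incrementAndGet();
        try {
            // Second check: another thread may have loaded it while we waited.
            block = cache.get(offset);
            if (block == null) {
                block = loader.apply(offset);
                loads.incrementAndGet();
                cache.put(offset, block);
            }
            return block;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        DoubleCheckedCacheSketch sketch = new DoubleCheckedCacheSketch();
        sketch.readBlock(42L, off -> new byte[] {1, 2, 3});
        sketch.readBlock(42L, off -> new byte[] {9});
        System.out.println("loads=" + sketch.loads.get()
                + " lockAcquisitions=" + sketch.lockAcquisitions.get());
    }
}
```

The point of the pattern is that when every handler takes the lock before checking the cache, even cache hits queue up behind one another, which is consistent with 200 handlers parked on the same lock.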
-
Re: .META. region server DDOSed by too many clients
Varun Sharma 2012-12-06, 00:37
I see, but is this pointing to the fact that we are heading to disk for scanning META? If yes, that would be pretty bad, no? Currently I am trying to see if the freeze coincides with the block cache being full (we have an in-memory column). Is the META table cached just like other tables?

Varun
-
Re: .META. region server DDOSed by too many clients
Varun Sharma 2012-12-06, 00:40
We only see this on the .META. region not otherwise...
-
RE: .META. region server DDOSed by too many clients
Anoop Sam John 2012-12-06, 04:25
> is the META table cached just like other tables

Yes Varun, I think so.

-Anoop-
-
Re: .META. region server DDOSed by too many clients
ramkrishna vasudevan 2012-12-06, 06:04
Is block cache ON? Check out HBASE-5898?

Regards
Ram
-
Re: .META. region server DDOSed by too many clients
Varun Sharma 2012-12-06, 10:32
Hi Ram,
Yes, BlockCache is on, but there is another in-memory column which might be pushing other data out of the block cache. So we might be hitting more disk seeks. I see that you have seen this trace before on HBASE-5898. Did that issue resolve things for you?

Thanks
Varun
-
Re: .META. region server DDOSed by too many clients
ramkrishna vasudevan 2012-12-06, 10:59
Actually, when we observed this, our block cache was OFF. If possible, try applying the patch and see what happens. If you have memory to spare, try increasing the fraction of heap allocated to the block cache.

Regards,
Ram
-
Re: .META. region server DDOSed by too many clients
Varun Sharma 2012-12-06, 11:29
I see. I am going to try the patch then. It looks like all the threads are deadlocked, waiting on the lock for the same block ID. The cache hit ratio is pretty high. Also, the server has been in this state for the past hour, and I don't think it should take an hour to load one HDFS block. I am seeing the issue repeatedly. It looks like something is probably wrong with the locking mechanism when you have a high number of IPC handlers, like 200.
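To make the locking pattern under discussion concrete, here is a minimal Java sketch of a per-block lock placed in front of a cache. It is not HBase's IdLock or HFileReaderV2 code; the class name PerBlockLockSketch, the method names, and the 50 ms simulated load are all assumptions made for illustration. It shows that when many handlers ask for the same uncached block, one of them loads it and the rest park on that block's lock until the load finishes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch only: one lock per block id, guarding a cache fill.
class PerBlockLockSketch {
    private final ConcurrentMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final ConcurrentMap<Long, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger loads = new AtomicInteger();

    String readBlock(long blockId) {
        String cached = cache.get(blockId);
        if (cached != null) {
            return cached; // fast path: cache hit, no locking
        }
        // Every reader of the same block id gets the same lock object.
        // (Simplification: lock entries are never removed from the map.)
        ReentrantLock lock = locks.computeIfAbsent(blockId, id -> new ReentrantLock());
        lock.lock(); // readers of a block that is still loading park here
        try {
            cached = cache.get(blockId);
            if (cached != null) {
                return cached; // another thread loaded it while we waited
            }
            loads.incrementAndGet();
            String block = slowLoad(blockId);
            cache.put(blockId, block);
            return block;
        } finally {
            lock.unlock();
        }
    }

    // Stand-in for a disk or HDFS read; the 50 ms delay is arbitrary.
    private String slowLoad(long blockId) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "block-" + blockId;
    }

    // Starts `threads` readers of the same block and returns how many loads ran.
    static int runDemo(int threads) {
        PerBlockLockSketch sketch = new PerBlockLockSketch();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                sketch.readBlock(42L);
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return sketch.loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loads = " + runDemo(20));
    }
}
```

In this sketch runDemo(20) returns 1: the block is loaded once and the other readers wait. The waiters only make progress when the single load completes, so a load that is slow or never returns would leave every handler asking for that block parked on one lock.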
-
Re: .META. region server DDOSed by too many clients
ramkrishna vasudevan 2012-12-06, 11:52
Hmm, yes. If you look at the discussion on the JIRA, Stack felt there was something else going on there. Check your Hadoop-side logs and thread dumps. That may tell you whether your datanodes are a bit lazy. :)

Regards,
Ram
|
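When going through thread dumps as suggested above, it can help to count how many threads are parked on each lock address. The following is a small Java sketch under stated assumptions: the class ParkedLockCounter and its method are hypothetical names, and it only recognizes the "- parking to wait for <0x...>" lines that jstack prints, as seen in the trace at the top of this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: tallies jstack "parking to wait for" lines per lock address.
class ParkedLockCounter {
    private static final Pattern PARKED =
        Pattern.compile("- parking to wait for\\s+<(0x[0-9a-fA-F]+)>");

    // Returns lock address -> number of threads parked on it, in first-seen order.
    static Map<String, Integer> countParkedLocks(String dump) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = PARKED.matcher(dump);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Tiny made-up dump; in practice, read the jstack output from a file.
        String sample =
            "\"IPC Server handler 1 on 60020\" daemon\n"
          + "   - parking to wait for  <0x0000000542d72e30> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)\n"
          + "\"IPC Server handler 2 on 60020\" daemon\n"
          + "   - parking to wait for  <0x0000000542d72e30> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)\n";
        countParkedLocks(sample).forEach(
            (lock, n) -> System.out.println(lock + " -> " + n + " parked threads"));
    }
}
```

If nearly all handlers show up under one address, the thread that currently holds that lock is the one to inspect next, for example to see whether it is blocked on a datanode read.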