Re: REST servers locked up on single RS malfunction
Jack Levin 2011-04-25, 17:32
Stack:
Exception in thread "pool-1-thread-9" java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

Btw, is this a put or a read? Perhaps we are crashing on some sort of large read?

-Jack

On Thu, Apr 21, 2011 at 12:47 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
> Shouldn't the RS just shut down then? Because it stays half alive and
> none of the puts succeed. Also, the OOME happened right after
> flush/compaction/split... so clearly the RS was busy, and it could be
> just a matter of hitting the heap ceiling.
>
> -Jack
>
> On Thu, Apr 21, 2011 at 12:13 AM, Stack <[EMAIL PROTECTED]> wrote:
>> This looks like a bug. Elsewhere in the RPC you can register a
>> handler for OOME explicitly, and we have a callback up into the
>> regionserver where we set the server to abort or stop depending
>> on the type of OOME we see. In this case it looks like on OOME we just
>> throw, and then all the executors fill, so no more executors are
>> available to process requests (this is my current assessment -- it
>> could be a different one by morning).
>>
>> The root cause would look to be a big put. Could that be the case?
>>
>> On the naming, that looks to be the default naming of executor threads
>> done by the hosting ExecutorService.
>>
>> St.Ack
>>
>>
>> On Wed, Apr 20, 2011 at 10:11 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>>> Hello, with HBase 0.89 we see the following: all REST servers get
>>> locked up trying to connect to one of our RS servers. The error in the
>>> .out file on that region server looks like this:
>>>
>>> Exception in thread "pool-1-thread-3" java.lang.OutOfMemoryError: Java
>>> heap space
>>>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
>>>     at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>     at java.lang.Thread.run(Thread.java:619)
>>>
>>> The question is, how come the region server did not die after this but
>>> just hogged the REST connections? And what does pool-1-thread-3 actually
>>> do?
>>>
>>> -Jack
>>>
>>
>
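
For reference, the idea Stack describes (register an OOME handler on the reader threads and call back into the server to abort) can be sketched in plain Java roughly as below. This is only an illustration using java.util.concurrent, not the HBase code: OomeAwarePool, AbortHook and newPool are hypothetical names, and the OOME is simulated by throwing the error directly. It also shows where names like "pool-1-thread-3" come from: they are what the default thread factory assigns when no custom factory is supplied.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class OomeAwarePool {

    /** Hypothetical callback; stands in for whatever aborts the server. */
    public interface AbortHook {
        void abort(String why, Throwable cause);
    }

    /** Builds a fixed pool whose threads report OutOfMemoryError to the hook. */
    public static ExecutorService newPool(int size, String prefix, AbortHook hook) {
        AtomicInteger seq = new AtomicInteger(1);
        ThreadFactory factory = runnable -> {
            // Descriptive names instead of the default "pool-N-thread-M".
            Thread t = new Thread(runnable, prefix + "-" + seq.getAndIncrement());
            t.setDaemon(true);
            t.setUncaughtExceptionHandler((thread, err) -> {
                if (err instanceof OutOfMemoryError) {
                    // Escalate instead of only printing the trace to stderr.
                    hook.abort("OOME in " + thread.getName(), err);
                } else {
                    System.err.println("Uncaught in " + thread.getName() + ": " + err);
                }
            });
            return t;
        };
        return Executors.newFixedThreadPool(size, factory);
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch aborted = new CountDownLatch(1);
        ExecutorService pool = newPool(2, "reader", (why, cause) -> {
            System.err.println("Aborting: " + why);
            aborted.countDown();
        });
        // execute(), not submit(): submit() would capture the error in a Future
        // and the uncaught-exception handler would never run.
        pool.execute(() -> { throw new OutOfMemoryError("simulated"); });
        aborted.await(5, TimeUnit.SECONDS);
        pool.shutdownNow();
    }
}
```

The handler only fires for tasks started with execute(); what the hook should do on a real server (abort versus stop) is left open here, as it is in the thread.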