Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase, mail # user - REST servers locked up on single RS malfunction.

Copy link to this message
Re: REST servers locked up on single RS malfunction.
Jack Levin 2011-04-21, 07:47
Shouldn't the RS just shutdown then?  Because it stays half alive and
none of the puts succeed.  Also the oome happen right after
flush/compaction/split... so clearly the RS was busy, and it could be
just a matter of hitting Heap ceiling perhaps.


On Thu, Apr 21, 2011 at 12:13 AM, Stack <[EMAIL PROTECTED]> wrote:
> This looks like a bug.  Elsewhere in the RPC you can register a
> handler for OOME explicitly and we have a callback up into the
> regionserver where we will set that the server abort or stop dependent
> on type of OOME we see.  In this case it looks like on OOME we just
> throw and the then all the executors fill so no more executors
> available to process requests (This is my current accessment -- it
> could be a different one by morning).
> The root cause would look to be a big put.  Could that be the case.
> On the naming, that looks to be the default naming of executor threads
> done by the hosting executorservice.
> St.Ack
> On Wed, Apr 20, 2011 at 10:11 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> Hello, with 0.89 HBASE, we see the following, all REST servers get
>> locked on trying to connect to one of our RS servers, the error in the
>> .out file on that Region Server looks like this:
>> Exception in thread "pool-1-thread-3" java.lang.OutOfMemoryError: Java
>> heap space
>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:619)
>> Question is, how come the region server did not die after this but
>> just hogged the REST connections?  And what is pool1-thread-3 actually
>> do?
>> -Jack