HBase, mail # user - REST servers locked up on single RS malfunction.


Re: REST servers locked up on single RS malfunction.
Jack Levin 2011-04-25, 17:32
Stack:

Exception in thread "pool-1-thread-9" java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

Btw, is this a put or a read?  Perhaps we are crashing on some sort of large read?

-Jack

On Thu, Apr 21, 2011 at 12:47 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
> Shouldn't the RS just shut down then?  Because it stays half alive and
> none of the puts succeed.  Also, the OOMEs happen right after
> flush/compaction/split, so clearly the RS was busy, and it could
> just be a matter of hitting the heap ceiling.
>
> -Jack
>
> On Thu, Apr 21, 2011 at 12:13 AM, Stack <[EMAIL PROTECTED]> wrote:
>> This looks like a bug.  Elsewhere in the RPC you can register a
>> handler for OOME explicitly, and we have a callback up into the
>> regionserver where we set the server to abort or stop depending
>> on the type of OOME we see.  In this case it looks like on OOME we just
>> throw, and then all the executors fill up so no more executors are
>> available to process requests (this is my current assessment; it
>> could be a different one by morning).
>>
>> The root cause would look to be a big put.  Could that be the case?
>>
>> On the naming, that looks to be the default naming of executor threads
>> done by the hosting ExecutorService.
>>
>> St.Ack
>>
>>
>> On Wed, Apr 20, 2011 at 10:11 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>>> Hello, with HBase 0.89 we see the following: all REST servers get
>>> locked up trying to connect to one of our RS servers. The error in the
>>> .out file on that region server looks like this:
>>>
>>> Exception in thread "pool-1-thread-3" java.lang.OutOfMemoryError: Java heap space
>>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>        at java.lang.Thread.run(Thread.java:619)
>>>
>>> Question is, how come the region server did not die after this but
>>> just hogged the REST connections?  And what does pool-1-thread-3 actually
>>> do?
>>>
>>> -Jack
>>>
>>
>
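
The handling Stack describes (catch the OOME in the RPC layer and call back into the server so it can abort or stop) can be sketched roughly as below. This is not HBase code: `OomeAwareReader`, `OomeHandler`, `guarded` and `simulate` are hypothetical names made up for illustration, and the OutOfMemoryError is thrown by hand rather than produced by real heap exhaustion. The sketch also shows where names like "pool-1-thread-3" come from: a pool created through `java.util.concurrent.Executors` without a custom ThreadFactory names its threads pool-N-thread-M.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class OomeAwareReader {
    /** Hypothetical callback; stands in for whatever abort/stop hook the hosting server exposes. */
    interface OomeHandler {
        void onOutOfMemory(String threadName, OutOfMemoryError e);
    }

    /** Wraps a task so an OutOfMemoryError is reported to the handler instead of only escaping the pool thread. */
    static Runnable guarded(Runnable task, OomeHandler handler) {
        return () -> {
            try {
                task.run();
            } catch (OutOfMemoryError e) {
                handler.onOutOfMemory(Thread.currentThread().getName(), e);
            }
        };
    }

    /** Runs one failing task on a default-named pool and returns the name of the thread that reported the OOME. */
    static String simulate() {
        AtomicReference<String> reportedBy = new AtomicReference<>();
        // No ThreadFactory supplied, so threads get the default pool-N-thread-M names.
        ExecutorService readers = Executors.newFixedThreadPool(2);
        readers.execute(guarded(
            // Simulated failure (assumption): thrown directly, the heap is not actually exhausted.
            () -> { throw new OutOfMemoryError("simulated: Java heap space"); },
            // A real server would abort or stop here; this sketch only records the thread name.
            (name, e) -> reportedBy.set(name)));
        readers.shutdown();
        try {
            readers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return reportedBy.get();
    }

    public static void main(String[] args) {
        System.out.println("OOME reported by " + simulate());
    }
}
```

In a real server the handler would trigger the abort or stop that Jack expected, rather than leaving the process half alive with its reader threads unavailable.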