Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # user >> REST servers locked up on single RS malfunction.


Copy link to this message
-
Re: REST servers locked up on single RS malfunction.
Shouldn't the RS just shutdown then?  Because it stays half alive and
none of the puts succeed.  Also the oome happen right after
flush/compaction/split... so clearly the RS was busy, and it could be
just a matter of hitting Heap ceiling perhaps.

-Jack

On Thu, Apr 21, 2011 at 12:13 AM, Stack <[EMAIL PROTECTED]> wrote:
> This looks like a bug.  Elsewhere in the RPC you can register a
> handler for OOME explicitly and we have a callback up into the
> regionserver where we will set that the server abort or stop dependent
> on type of OOME we see.  In this case it looks like on OOME we just
> throw and the then all the executors fill so no more executors
> available to process requests (This is my current accessment -- it
> could be a different one by morning).
>
> The root cause would look to be a big put.  Could that be the case.
>
> On the naming, that looks to be the default naming of executor threads
> done by the hosting executorservice.
>
> St.Ack
>
>
> On Wed, Apr 20, 2011 at 10:11 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> Hello, with 0.89 HBASE, we see the following, all REST servers get
>> locked on trying to connect to one of our RS servers, the error in the
>> .out file on that Region Server looks like this:
>>
>> Exception in thread "pool-1-thread-3" java.lang.OutOfMemoryError: Java
>> heap space
>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:619)
>>
>> Question is, how come the region server did not die after this but
>> just hogged the REST connections?  And what is pool1-thread-3 actually
>> do?
>>
>> -Jack
>>
>