HBase >> mail # user >> REST servers locked up on single RS malfunction.


Re: REST servers locked up on single RS malfunction.
That's a separate cluster, and it's barely getting any traffic, so I don't
think the queue would be an issue.  We do, however, have very large files
stored (one file per row).  So the question is: if this is a GET that breaks
things, how can we avoid it?

-Jack
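
The queue concern raised in the HBASE-3813 excerpt quoted further down comes down to simple arithmetic: memory held by the call queue is roughly the number of queued calls times the payload each call carries. A minimal sketch of that estimate follows, using the 2500 queued items and 1MB payload from the quoted excerpt. The class and method names are hypothetical, and it assumes every call carries the same payload, which real traffic will not.

```java
public class CallQueueEstimate {
    /** Bytes held by queued calls, assuming every call carries the same payload. */
    static long queuedBytes(long queuedCalls, long payloadBytes) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(queuedCalls, payloadBytes);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Figures from the quoted JIRA comment: 2500 queued calls of 1MB each.
        long bytes = queuedBytes(2500, mb);
        System.out.println(bytes / mb + " MB"); // prints "2500 MB"
    }
}
```

With very large rows, the payload term can dominate, so even a short queue on a lightly loaded cluster may hold a noticeable share of the heap.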

On Mon, Apr 25, 2011 at 10:37 AM, Jean-Daniel Cryans
<[EMAIL PROTECTED]> wrote:
> Can't tell what it was because it OOME'd while reading whatever was coming in.
>
> Did you bump the number of handlers in that cluster too? Because you
> might hit what we talked about in this jira:
> https://issues.apache.org/jira/browse/HBASE-3813
>
> "Chatting w/ J-D this morning, he asked if the queues hold 'data'. The
> queues hold 'Calls'. Calls are the client request. They contain data.
> Jack had 2500 items queued. If each item to insert was 1MB, that's 2.5k
> * 1MB of memory that is outside of our general accounting."
>
> So the higher the number of handlers the more memory could be used by
> the queues.
>
> J-D
>
> On Mon, Apr 25, 2011 at 10:32 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
>> Stack:
>>
>> Exception in thread "pool-1-thread-9" java.lang.OutOfMemoryError: Java
>> heap space
>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:619)
>>
>> Btw, is this a put or a read?  Perhaps we are crashing on some sort of large read?
>>
>> -Jack
>>
>> On Thu, Apr 21, 2011 at 12:47 AM, Jack Levin <[EMAIL PROTECTED]> wrote:
>>> Shouldn't the RS just shut down then?  Because it stays half alive and
>>> none of the puts succeed.  Also, the OOME happened right after
>>> flush/compaction/split... so clearly the RS was busy, and it could be
>>> just a matter of hitting the heap ceiling, perhaps.
>>>
>>> -Jack
>>>
>>> On Thu, Apr 21, 2011 at 12:13 AM, Stack <[EMAIL PROTECTED]> wrote:
>>>> This looks like a bug.  Elsewhere in the RPC you can register a
>>>> handler for OOME explicitly, and we have a callback up into the
>>>> regionserver where we set the server to abort or stop depending
>>>> on the type of OOME we see.  In this case it looks like on OOME we just
>>>> throw, and then all the executors fill, so no more executors are
>>>> available to process requests (this is my current assessment -- it
>>>> could be a different one by morning).
>>>>
>>>> The root cause would look to be a big put.  Could that be the case?
>>>>
>>>> On the naming, that looks to be the default naming of executor threads
>>>> done by the hosting executorservice.
>>>>
>>>> St.Ack
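
The pattern Stack describes, catching an OOME in an executor task and calling back into the server so it can abort or stop, can be sketched in plain Java. This is not HBase's code: `OomeGuard`, `AbortHook`, and `guard` are hypothetical names, and the OOME is simulated by throwing the error directly instead of exhausting the heap.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class OomeGuard {
    /** Hypothetical callback; stands in for whatever tells the server to abort or stop. */
    interface AbortHook {
        void abort(String why, Throwable cause);
    }

    /** Wraps a task so an OutOfMemoryError is reported to the hook instead of escaping the worker. */
    static Runnable guard(Runnable task, AbortHook hook) {
        return () -> {
            try {
                task.run();
            } catch (OutOfMemoryError e) {
                hook.abort("OOME in reader thread", e);
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean aborted = new AtomicBoolean(false);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Simulated failure: the task throws the error rather than really running out of heap.
        pool.execute(guard(
                () -> { throw new OutOfMemoryError("simulated"); },
                (why, cause) -> aborted.set(true)));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("abort requested: " + aborted.get()); // prints "abort requested: true"
    }
}
```

In a real server the hook would have to do very little work, since allocating more memory while handling an OOME may fail again.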
>>>>
>>>>
>>>> On Wed, Apr 20, 2011 at 10:11 PM, Jack Levin <[EMAIL PROTECTED]> wrote:
>>>>> Hello, with HBase 0.89 we see the following: all REST servers get
>>>>> locked up trying to connect to one of our RS servers. The error in the
>>>>> .out file on that Region Server looks like this:
>>>>>
>>>>> Exception in thread "pool-1-thread-3" java.lang.OutOfMemoryError: Java
>>>>> heap space
>>>>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invocation.readFields(HBaseRPC.java:120)
>>>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:959)
>>>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:927)
>>>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:503)
>>>>>        at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:297)
>>>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)