Re: Average RPC Queue Time
I'm not sure why it is so much higher than your rpc timeout.  Enabling
DEBUG log level on org.apache.hadoop.ipc.HBaseServer.trace and
org.apache.hadoop.ipc.HBaseServer loggers might provide you with some
insight.
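
As a minimal sketch, assuming your RegionServers read a log4j 1.x
properties file (the conf/log4j.properties location is an assumption about
your install; the logger names are the two mentioned above):

```
# Assumed location: conf/log4j.properties on each RegionServer
log4j.logger.org.apache.hadoop.ipc.HBaseServer=DEBUG
log4j.logger.org.apache.hadoop.ipc.HBaseServer.trace=DEBUG
```

This is likely to be verbose under load, so you may want to turn it back
off once you have captured a peak period.
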
On Wed, Nov 20, 2013 at 12:55 PM, Shawn Hermans <[EMAIL PROTECTED]> wrote:

> Shouldn't be.  Looks like Cloudera just converts it to nicer values.  So
> the actual peak value is 14438088.62 ms for Average RPC queue time.
>
>
> On Wed, Nov 20, 2013 at 11:51 AM, Bryan Beaudreault <
> [EMAIL PROTECTED]> wrote:
>
> > I'm not sure about the cloudera manager ui, but the metric posted to JMX is
> > in milliseconds.  Are we sure that is not accounting for the confusion?
> >
> >
> > On Wed, Nov 20, 2013 at 12:46 PM, Shawn Hermans <[EMAIL PROTECTED]> wrote:
> >
> > > Our hbase.rpc.timeout is set for 60 seconds.  Confused as to why I would
> > > see such large values for the average rpc queue time.  Are there any other
> > > metrics? The RPC call queue length is consistently between 150 and 200
> > > during peak usage time.  Is this normal?
> > >
> > > Regards,
> > > Shawn
> > >
> > >
> > > On Wed, Nov 20, 2013 at 11:24 AM, Jean-Marc Spaggiari <
> > > [EMAIL PROTECTED]> wrote:
> > >
> > > > But that will depend on the timeout that they have configured, right?
> > > >
> > > > I have seen some third party applications recommending to increase timeouts
> > > > to 1h30...
> > > >
> > > > JMS
> > > > On 2013-11-20 12:08, "Vladimir Rodionov" <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > > >>The RpcQueueTime metrics are a measurement of how long individual calls
> > > > > > >>stay in this queued state.  If your handlers were never 100% occupied, this
> > > > > > >>value would be 0.  An average of 3 hours is concerning, it basically means
> > > > > > >>that when a call comes into the RegionServer it takes on average 3 hours to
> > > > > > >>start processing, because handlers are all occupied for that amount of time.
> > > > >
> > > > > > Definitely, this metric is meaningless because the default RPC timeout is
> > > > > > 60 sec and under no circumstances can call data survive this 60 sec in a
> > > > > > callQueue unless we have a bug.
> > > > >
> > > > > Best regards,
> > > > > Vladimir Rodionov
> > > > > Principal Platform Engineer
> > > > > Carrier IQ, www.carrieriq.com
> > > > > e-mail: [EMAIL PROTECTED]
> > > > >
> > > > > ________________________________________
> > > > > From: Bryan Beaudreault [[EMAIL PROTECTED]]
> > > > > Sent: Wednesday, November 20, 2013 8:55 AM
> > > > > To: [EMAIL PROTECTED]
> > > > > Subject: Re: Average RPC Queue Time
> > > > >
> > > > > > A regionserver is configured with a certain number of RPC handlers
> > > > > > (hbase.regionserver.handler.count).  When these handlers are all occupied,
> > > > > > the calls back up into a callQueue.  This call queue is bounded by
> > > > > > ipc.server.max.callqueue.size (defaulting to 1GB of serialized requests)
> > > > > > and ipc.server.max.callqueue.length (10 * numHandlers).  So, with 5
> > > > > > handlers a maximum of 50 calls will be queued up before requests are
> > > > > > rejected outright.
> > > > >
> > > > > > The RpcQueueTime metrics are a measurement of how long individual calls
> > > > > > stay in this queued state.  If your handlers were never 100% occupied, this
> > > > > > value would be 0.  An average of 3 hours is concerning, it basically means
> > > > > > that when a call comes into the RegionServer it takes on average 3 hours to
> > > > > > start processing, because handlers are all occupied for that amount of
> > > > > > time.
> > > > >
> > > > > > You can lower the queue time through a few options:
> > > > > >
> > > > > > - Up the max number of handlers (beware using too many, as this just shifts
> > > > > > load to the disks, and also takes more memory)
> > > > > - Make your requests smaller (use caching or batching on a scan to