HBase, mail # user - RPC - Queue Time when handlers are all waiting


Re: RPC - Queue Time when handlers are all waiting
Federico Gaule 2013-12-09, 15:07
Here is a thread that describes what I think the behavior should be
(http://grokbase.com/t/hbase/user/13bmndq53k/average-rpc-queue-time):

"The RpcQueueTime metrics are a measurement of how long individual calls
stay in this queued state. If your handlers were never 100% occupied, this
value would be 0. An average of 3 hours is concerning, it basically means
that when a call comes into the RegionServer it takes on average 3 hours to
start processing, because handlers are all occupied for that amount of
time."

Is that correct?
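
To make the quoted definition concrete, here is a minimal Python sketch of a fixed pool of handlers draining a FIFO call queue. It is a toy model, not HBase code: the function names, the constant service time, and the idea that one sample is recorded per call (so num_ops grows even when nothing waits) are all assumptions made for illustration.

```python
import heapq

def simulate_queue_times(arrivals, service_time, num_handlers):
    """Return the queue time (start - arrival) of each call.

    Toy model (assumption): calls are served FIFO by `num_handlers`
    identical handlers, and every call takes `service_time` to process.
    """
    # Min-heap holding the time at which each handler becomes free.
    free_at = [0.0] * num_handlers
    heapq.heapify(free_at)
    queue_times = []
    for arrival in sorted(arrivals):
        handler_free = heapq.heappop(free_at)
        # A call starts as soon as it has arrived AND a handler is free.
        start = max(arrival, handler_free)
        queue_times.append(start - arrival)
        heapq.heappush(free_at, start + service_time)
    return queue_times

def summarize(queue_times):
    """Return (num_ops, avg_time), assuming one sample per call."""
    num_ops = len(queue_times)
    avg_time = sum(queue_times) / num_ops if num_ops else 0.0
    return num_ops, avg_time

# Handlers never all busy: every call starts immediately.
idle = simulate_queue_times([0, 1, 2, 3], service_time=0.5, num_handlers=2)
# One handler, four simultaneous calls: later calls wait behind earlier ones.
busy = simulate_queue_times([0, 0, 0, 0], service_time=1.0, num_handlers=1)
```

In this model an under-used pool still reports a growing num_ops, but its average queue time stays at zero; only a saturated pool produces a non-zero average.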

2013/12/9 Federico Gaule <[EMAIL PROTECTED]>

> Correct me if I'm wrong, but queues should be used only when all handlers
> are busy, shouldn't they?
> If that's true, I don't get why there is activity related to queues.
>
> Maybe I'm missing some piece of knowledge about when HBase uses queues
> :)
>
> Thanks
>
>
> 2013/12/9 Jean-Marc Spaggiari <[EMAIL PROTECTED]>
>
>> There might be something I'm missing ;)
>>
>> On cluster B, as you said, never more than 50% of your handlers are used.
>> Your Ganglia metrics show that there is activity (num_ops is
>> increasing), which is correct.
>>
>> Can you please confirm what you think is wrong in your charts?
>>
>> Thanks,
>>
>> JM
>>
>>
>> 2013/12/9 Federico Gaule <[EMAIL PROTECTED]>
>>
>> > Hi JM,
>> > Cluster B is only receiving replication data (writes), but handlers are
>> > idle most of the time (never more than 50% of them are in use). As I have
>> > read, the RPC queue is only used when all handlers are busy. Does that
>> > apply to replication as well?
>> >
>> > Thanks!
>> >
>> >
>> > 2013/12/9 Jean-Marc Spaggiari <[EMAIL PROTECTED]>
>> >
>> > > Hi,
>> > >
>> > > When you say that B doesn't get any read/write operations, does it mean
>> > > you stopped the replication? Or is B still getting the write operations
>> > > from A because of the replication? If so, that's why your RPC queue is
>> > > used...
>> > >
>> > > JM
>> > >
>> > >
>> > > 2013/12/9 Federico Gaule <[EMAIL PROTECTED]>
>> > >
>> > > > Not much information in the RS logs (DEBUG level set for
>> > > > org.apache.hadoop.hbase). Here is a sample from one regionserver
>> > > > showing increasing rpc.metrics.RpcQueueTime_num_ops and
>> > > > rpc.metrics.RpcQueueTime_avg_time activity:
>> > > >
>> > > > 2013-12-09 08:09:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB, free=2.73 GB, max=2.75 GB, blocks=0, accesses=122442151, hits=122168501, hitRatio=99.77%, , cachingAccesses=122192927, cachingHits=122162378, cachingHitsRatio=99.97%, , evictions=0, evicted=6768, evictedPerRun=Infinity
>> > > > 2013-12-09 08:09:11,396 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
>> > > > 2013-12-09 08:09:14,979 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 2
>> > > > 2013-12-09 08:09:16,016 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
>> > > > ...
>> > > > 2013-12-09 08:14:07,659 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
>> > > > 2013-12-09 08:14:08,713 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 3
>> > > > 2013-12-09 08:14:10,699 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=23.14 MB, free=2.73 GB, max=2.75 GB, blocks=0, accesses=122442151, hits=122168501, hitRatio=99.77%, , cachingAccesses=122192927, cachingHits=122162378, cachingHitsRatio=99.97%, , evictions=0, evicted=6768, evictedPerRun=Infinity
>> > > > 2013-12-09 08:14:12,711 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: Total replicated: 1
>> > > > 2013-12-09 08:14:14,778 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSink:
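
One way to relate the log sample above to the rising num_ops counter is to tally the ReplicationSink records per minute. The sketch below is only an illustration: the helper name `tally_replication` is hypothetical, it assumes each log record sits on a single line, and it treats each "Total replicated" record as one incoming replication call, which the thread itself does not confirm.

```python
import re
from collections import Counter

# Matches complete ReplicationSink records; anything else is ignored.
SINK_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} INFO "
    r"org\.apache\.hadoop\.hbase\.replication\.regionserver\.ReplicationSink: "
    r"Total replicated: (\d+)$"
)

def tally_replication(lines):
    """Return (records per minute, summed 'Total replicated' per minute)."""
    batches = Counter()
    replicated = Counter()
    for line in lines:
        match = SINK_LINE.match(line.strip())
        if match:
            minute = match.group(1)
            batches[minute] += 1
            replicated[minute] += int(match.group(2))
    return batches, replicated

PREFIX = "org.apache.hadoop.hbase.replication.regionserver.ReplicationSink: "
sample = [
    "2013-12-09 08:09:11,396 INFO " + PREFIX + "Total replicated: 1",
    "2013-12-09 08:09:14,979 INFO " + PREFIX + "Total replicated: 2",
    "2013-12-09 08:09:16,016 INFO " + PREFIX + "Total replicated: 1",
    "2013-12-09 08:14:07,659 INFO " + PREFIX + "Total replicated: 1",
    "2013-12-09 08:14:08,713 INFO " + PREFIX + "Total replicated: 3",
]
batches, replicated = tally_replication(sample)
for minute in sorted(batches):
    print(minute, batches[minute], replicated[minute])
```

If the per-minute record count tracks the growth of RpcQueueTime_num_ops in Ganglia, that would be consistent with the replication traffic being the source of the queue activity.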