HBase, mail # user - RPC - Queue Time when handlers are all waiting


Re: RPC - Queue Time when handlers are all waiting
Federico Gaule 2013-12-10, 12:05
I've increased hbase.regionserver.replication.handler.count 10x (to 30), but
nothing has changed. rpc.metrics.RpcQueueTime_avg_time still shows
activity :(

Mon Dec 09 14:04:10 EST 2013 | REPL IPC Server handler 29 on 60000 | WAITING (since 16hrs, 58mins, 56sec ago) | Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013 | REPL IPC Server handler 28 on 60000 | WAITING (since 16hrs, 58mins, 56sec ago) | Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013 | REPL IPC Server handler 27 on 60000 | WAITING (since 16hrs, 58mins, 56sec ago) | Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013 | REPL IPC Server handler 26 on 60000 | WAITING (since 16hrs, 58mins, 56sec ago) | Waiting for a call (since 16hrs, 58mins, 56sec ago)
...
Mon Dec 09 14:04:10 EST 2013 | REPL IPC Server handler 2 on 60000 | WAITING (since 16hrs, 58mins, 56sec ago) | Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013 | REPL IPC Server handler 1 on 60000 | WAITING (since 16hrs, 58mins, 56sec ago) | Waiting for a call (since 16hrs, 58mins, 56sec ago)
Mon Dec 09 14:04:10 EST 2013 | REPL IPC Server handler 0 on 60000 | WAITING (since 16hrs, 58mins, 56sec ago) | Waiting for a call (since 16hrs, 58mins, 56sec ago)
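
To make the question concrete, here is a minimal sketch of a generic handler pool, assuming queue time is measured from the moment a call is enqueued until a handler picks it up. It is not taken from the HBase source; the class QueueTimeSketch, the Call type, and measureQueueTimes are hypothetical names used only for illustration. Under that assumption every call yields a queue-time sample, even when all handlers are idle in "Waiting for a call", so some metric activity would not by itself prove the handlers are saturated. Whether that explains these charts depends on the actual values, which are not stated in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class QueueTimeSketch {
    // Hypothetical stand-in for an RPC call: remembers when it was created/enqueued.
    static final class Call {
        final long enqueuedNanos = System.nanoTime();
    }

    // Starts `handlers` threads that block on the queue, submits `calls` calls
    // at a low rate, and returns the queue time in nanoseconds for each call.
    static List<Long> measureQueueTimes(int handlers, int calls) {
        BlockingQueue<Call> queue = new LinkedBlockingQueue<>();
        List<Long> queueTimes = new CopyOnWriteArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < handlers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        Call c = queue.take(); // idle handler: waiting for a call
                        queueTimes.add(System.nanoTime() - c.enqueuedNanos);
                    }
                } catch (InterruptedException e) {
                    // interrupted on shutdown: let the handler thread exit
                }
            });
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        try {
            for (int i = 0; i < calls; i++) {
                queue.put(new Call());
                Thread.sleep(5); // low load: handlers are idle between calls
            }
            while (queueTimes.size() < calls) {
                Thread.sleep(1); // wait until every call has been picked up
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        } finally {
            for (Thread t : threads) {
                t.interrupt();
            }
        }
        return queueTimes;
    }

    public static void main(String[] args) {
        List<Long> times = measureQueueTimes(3, 20);
        long max = 0;
        for (long t : times) {
            max = Math.max(max, t);
        }
        System.out.println("calls=" + times.size() + " maxQueueNanos=" + max);
    }
}
```

In this model the handlers are never all busy, yet each call still spends a short, measurable time in the queue before a handler wakes up. The exact numbers vary from run to run.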
Thanks JM
2013/12/9 Jean-Marc Spaggiari <[EMAIL PROTECTED]>

> Yes, the default value is 3 in 0.94.14. If you have not changed it, then
> it's still 3.
>
> conf.getInt("hbase.regionserver.replication.handler.count", 3);
>
> Keep us posted on the results.
>
> JM
>
>
> 2013/12/9 Federico Gaule <[EMAIL PROTECTED]>
>
> > I'm using the default value for
> > hbase.regionserver.replication.handler.count (I can't find what the
> > default is. Is it 3?)
> > I'll try increasing that property.
> >
> > Fri Dec 06 12:44:12 EST 2013 | REPL IPC Server handler 2 on 60020 | WAITING (since 8sec ago) | Waiting for a call (since 8sec ago)
> > Fri Dec 06 12:44:12 EST 2013 | REPL IPC Server handler 1 on 60020 | WAITING (since 8sec ago) | Waiting for a call (since 8sec ago)
> > Fri Dec 06 12:44:12 EST 2013 | REPL IPC Server handler 0 on 60020 | WAITING (since 2sec ago) | Waiting for a call (since 2sec ago)
> > Thanks JM
> >
> >
> > 2013/12/9 Jean-Marc Spaggiari <[EMAIL PROTECTED]>
> >
> > > For replication, the handlers used on the slave cluster are configured
> > > by hbase.regionserver.replication.handler.count. What value do you have
> > > for this property?
> > >
> > > JM
> > >
> > >
> > > 2013/12/9 Federico Gaule <[EMAIL PROTECTED]>
> > >
> > > > Here is a thread saying what I think it should be (
> > > > http://grokbase.com/t/hbase/user/13bmndq53k/average-rpc-queue-time)
> > > >
> > > > "The RpcQueueTime metrics are a measurement of how long individual
> > > > calls stay in this queued state. If your handlers were never 100%
> > > > occupied, this value would be 0. An average of 3 hours is concerning,
> > > > it basically means that when a call comes into the RegionServer it
> > > > takes on average 3 hours to start processing, because handlers are all
> > > > occupied for that amount of time."
> > > >
> > > > Is that correct?
> > > >
> > > >
> > > >
> > > > 2013/12/9 Federico Gaule <[EMAIL PROTECTED]>
> > > >
> > > > > Correct me if I'm wrong, but queues should be used only when all
> > > > > handlers are busy, shouldn't they?
> > > > > If that's true, I don't get why there is activity related to queues.
> > > > >
> > > > > Maybe I'm missing some piece of knowledge about when HBase uses
> > > > > queues :)
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > 2013/12/9 Jean-Marc Spaggiari <[EMAIL PROTECTED]>
> > > > >
> > > > >> There might be something I'm missing ;)
> > > > >>
> > > > > >> On cluster B, as you said, never more than 50% of your handlers
> > > > > >> are used. Your Ganglia metrics show that there is activity (num
> > > > > >> ops is increasing), which is correct.
> > > > >>
> > > > >> Can you please confirm what you think is wrong from your charts?