Re: How to manage retry failures in the HBase client
Ted Yu 2013-09-17, 18:00
bq. What is the formal way to request a specific documentation change?
Once the suggested change is acknowledged, a JIRA can be opened where you attach
bq. Do I need to sign a contributor agreement?
I don't think so.
On Tue, Sep 17, 2013 at 10:48 AM, Tom Brown <[EMAIL PROTECTED]> wrote:
> I had read that section for those values, but it was unclear (the
> hbase.client.retries.number description subtly switches to describe
> hbase.client.pause, and I missed that context switch).
> If I could make a recommendation for changing those items' descriptions, I
> would rearrange them like so:
> General client pause value. Used mostly as value to wait before running a
> retry of a failed get, region lookup, etc. The actual retry interval is a
> rough function based on this setting. At first we retry at this interval
> but then with backoff, we pretty quickly reach retrying every ten seconds.
> See HConstants#RETRY_BACKOFF for how the backoff ramps up.
> Default: 100
> Maximum retries. Used as maximum for all retryable operations such as the
> getting of a cell's value, starting a row update, etc. Change this setting
> and hbase.client.pause to suit your workload.
> Default: 35
> What is the formal way to request a specific documentation change? Do I
> need to sign a contributor agreement?
> On Tue, Sep 17, 2013 at 11:40 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > Have you looked at
> > http://hbase.apache.org/book.html#hbase_default_configurations where
> > hbase.client.retries.number
> > and hbase.client.pause are explained ?
> > Cheers
> > On Tue, Sep 17, 2013 at 10:34 AM, Tom Brown <[EMAIL PROTECTED]> wrote:
> > > I have a region-server coprocessor that scans its portion of a table
> > > based on a request and summarizes the results (designed this way to
> > > reduce network data transfer).
> > >
> > > In certain circumstances, the HBase cluster gets a bit overloaded, and
> > > a query will take too long. In that instance, the HBase client will
> > > retry the query (up to N times). When this happens, any other running
> > > queries often time out and generate retries as well. This results in
> > > the cluster becoming unresponsive until I'm able to kill the clients
> > > that are retrying their requests.
> > >
> > > I have found the "hbase.client.retries.number" property, but that
> > > doesn't claim to set the number of retries, rather the amount of time
> > > between retries. Is there a different property I can use to set the
> > > maximum number of retries? Or is this property mis-documented?
> > >
> > > Thanks in advance!
> > >
> > > --Tom
> > >