Re: Handling regionserver crashes in production cluster
You can raise the value of the property below to close more regions at a time.

  <property>
    <name>hbase.regionserver.executor.closeregion.threads</name>
    <value>3</value>
  </property>
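
A minimal sketch (not part of the original thread) of checking which value an hbase-site.xml-style file carries for this property. The XML fragment, the value 10, and the helper name read_property are illustrative assumptions; only Python's standard library is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical hbase-site.xml fragment; the value 10 is only an illustration,
# not a recommended setting.
HBASE_SITE = """<configuration>
  <property>
    <name>hbase.regionserver.executor.closeregion.threads</name>
    <value>10</value>
  </property>
</configuration>"""


def read_property(xml_text, name):
    """Return the <value> text of the named property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


threads = read_property(
    HBASE_SITE, "hbase.regionserver.executor.closeregion.threads"
)
print(threads)
```

A result of None means the property is not set in the fragment that was parsed.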
On Wed, Jun 12, 2013 at 7:38 PM, Nicolas Liochon <[EMAIL PROTECTED]> wrote:

> What was your test exactly? You killed -9 a region server but kept the
> datanode alive?
> Could you detail the queries you were doing?
>
>
> On Wed, Jun 12, 2013 at 2:10 PM, kiran <[EMAIL PROTECTED]> wrote:
>
> > It is not possible for us to migrate to a new version immediately.
> >
> > @Anoop we purposely brought down one regionserver and then observed that
> > the website was taking too long to respond. We saw this pattern for
> > about 5 min, until the regions were relocated.
> > We also made sure that the queries issued from our website did not
> > fall in the regions of the regionserver we brought down.
> >
> > Is there any configuration workaround to mitigate it?
> >
> > Thanks
> > Kiran
> >
> >
> >
> > On Thu, Jun 6, 2013 at 8:27 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Kiran,
> > >
> > > Also, any chance for you to migrate to 0.94.8? There have been
> > > hundreds of fixes since 0.94.1...
> > >
> > > JM
> > >
> > > 2013/6/6 Anoop John <[EMAIL PROTECTED]>:
> > > > How many RSs are there in the cluster in total?  You mean you cannot do
> > > > any operation on other regions in the live cluster?  That should not
> > > > happen.  Is it possible that the client ops are targeted at the regions
> > > > which were on the dead RS (and are now in transition)?  Can you take a
> > > > closer look?
> > > > If not, please check where the RS threads are getting blocked.
> > > >
> > > > -Anoop-
> > > >
> > > > On Wed, Jun 5, 2013 at 10:50 PM, kiran <[EMAIL PROTECTED]> wrote:
> > > >
> > > >> Dear All,
> > > >>
> > > >> We have a production cluster that runs on HBase 0.94.1. The issue we
> > > >> are facing is that whenever one regionserver goes down, the cluster
> > > >> becomes unresponsive until all the regions are allocated to other
> > > >> regionserver(s). The transition takes about 3-5 mins, and during this
> > > >> time we are unable to do any client operation on the cluster.
> > > >>
> > > >> Is there any way we can make the transition run in the background?
> > > >>
> > > >> Also, it is acceptable for us if client operations such as scan or get
> > > >> do not work on the rowkeys of regions in transition. But they are not
> > > >> working on the entire cluster until all the regions are moved out of
> > > >> transition. We can't afford 3-5 minutes of downtime.
> > > >>
> > > >> --
> > > >> Thank you
> > > >> Kiran Sarvabhotla
> > > >>
> > > >> -----Even a correct decision is wrong when it is taken late
> > > >>
> > >
> >
> >
> >
> > --
> > Thank you
> > Kiran Sarvabhotla
> >
> > -----Even a correct decision is wrong when it is taken late
> >
>