HBase >> mail # dev >> All region server died due to "Parent directory doesn't exist"


Re: All region server died due to "Parent directory doesn't exist"
Thanks Varun for sharing your experience.

Lars:
Was the server carrying .META. functioning properly around the time
you observed the problem?

Cheers

On Thu, May 9, 2013 at 9:41 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:

> I meant no NTP/clock synchronization between the ZooKeeper quorum and the
> HBase cluster. I am not sure if you are seeing the exact same issue though.
> We did not have mass failures at the same time due to this.
>
> Thanks
> Varun
>
>
> On Thu, May 9, 2013 at 9:39 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:
>
> > Btw, I am not 100% sure, but I have seen something like this before:
> >
> > 1) ZK connection flakiness causes ephemeral nodes to expire
> > 2) Master detects failure and renames the logs into a splitting directory
> > - this is intentional so that in case that region server comes back up, it
> > cannot write to the logs being split
> > 3) Region server dies because the log is renamed
> >
> > So, the yanking away of files is done by the HBase master and is expected
> > if the master feels the server is dead. We found that the region server
> > logs DFS exceptions like crazy (1000s of them) in that case, and we always
> > suspected some kind of DFS error, but when we traced the exceptions back to
> > the point where they started, we found ZooKeeper session issues.
> >
> > We had two cases of this, caused either by very high load or by missing
> > NTP/clock synchronization between the clusters.
> >
> > Thanks
> > Varun
> >
> >
> > On Thu, May 9, 2013 at 9:16 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >
> >> Thanks Ted. I'll do the same.
> >>
> >>
> >> ----- Original Message -----
> >> From: Ted Yu <[EMAIL PROTECTED]>
> >> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> >> Cc:
> >> Sent: Thursday, May 9, 2013 9:07 AM
> >> Subject: Re: All region server died due to "Parent directory doesn't
> >> exist"
> >>
> >> I went through the patch for HBASE-7824 one more time and didn't find a
> >> direct correlation to the issue Lars reported.
> >>
> >> I am going over the other JIRAs in Lars' list.
> >>
> >> Cheers
> >>
> >> On Thu, May 9, 2013 at 8:48 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >>
> >> > I will try. I do not think this is the issue, though.
> >> >
> >> > The master is up in my case.
> >> > Right now the cluster is in a state where each region server aborts itself
> >> > shortly after being started (which coincides with having its log directory
> >> > renamed to ...-splitting).
> >> >
> >> >
> >> > This is a test cluster and I could just start from scratch... This appears
> >> > to be a serious enough problem, though, and I would like to track down the
> >> > issue.
> >> >
> >> > -- Lars
> >> >
> >> >
> >> >
> >> > ----- Original Message -----
> >> > From: Ted Yu <[EMAIL PROTECTED]>
> >> > To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> >> > Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> >> > Sent: Thursday, May 9, 2013 2:04 AM
> >> > Subject: Re: All region server died due to "Parent directory doesn't exist"
> >> >
> >> > The config came from HBASE-7824.
> >> >
> >> > There are other JIRAs in Lars' list which are related to log splitting.
> >> >
> >> > I think more investigation is needed.
> >> >
> >> > Cheers
> >> >
> >> > On May 9, 2013, at 1:59 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> >> >
> >> > > So that is HBASE-7824, right?
> >> > >
> >> > > On Thu, May 9, 2013 at 4:33 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >> > >
> >> > >> hbase.master.wait.for.log.splitting
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Best regards,
> >> > >
> >> > >   - Andy
> >> > >
> >> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >> > > (via Tom White)
> >> >
> >> >
> >>
> >>
> >
>
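
Note on the fencing sequence Varun describes (ephemeral node expires, master renames the log directory to a "-splitting" directory, region server dies on its next write): the effect can be reproduced in miniature on a local filesystem. The sketch below is only an illustration under stated assumptions, not HBase code. The function names, the "rs1" directory name, and the "wal.N" file names are hypothetical, and a local directory stands in for HDFS. It shows why a writer that still holds the old path fails once its parent directory has been renamed away.

```python
import os
import tempfile

# Suffix taken from the thread ("renamed to ...-splitting").
SPLITTING_SUFFIX = "-splitting"


def fence_log_dir(log_dir: str) -> str:
    """Master side (hypothetical helper): rename the log directory of a
    server presumed dead so that server can no longer add files to it."""
    splitting_dir = log_dir + SPLITTING_SUFFIX
    os.rename(log_dir, splitting_dir)
    return splitting_dir


def roll_log(log_dir: str, seq: int) -> str:
    """Region-server side (hypothetical helper): create the next log file
    under the server's own log directory."""
    path = os.path.join(log_dir, f"wal.{seq}")
    # Mode "x" creates a new file; it fails if the parent directory is gone.
    with open(path, "x"):
        pass
    return path


def simulate() -> dict:
    """Run the three steps from the thread against a temporary directory."""
    with tempfile.TemporaryDirectory() as root:
        log_dir = os.path.join(root, "rs1")  # hypothetical server log dir
        os.mkdir(log_dir)

        # Healthy server writes its first log file.
        first = roll_log(log_dir, 1)

        # Steps 1 and 2: the session is considered expired, so the master
        # fences the server by renaming its log directory.
        splitting_dir = fence_log_dir(log_dir)

        # Step 3: the server, still alive, tries to write under the old path.
        try:
            roll_log(log_dir, 2)
            aborted = False
        except FileNotFoundError:
            # Local analogue of "Parent directory doesn't exist".
            aborted = True

        return {
            "first_log": os.path.basename(first),
            "splitting_dir": os.path.basename(splitting_dir),
            "logs_for_split": sorted(os.listdir(splitting_dir)),
            "server_aborted": aborted,
        }


if __name__ == "__main__":
    print(simulate())
```

In this toy version the rename is what makes the write fail, so errors of this kind on the region server side may be a symptom of the master having declared the server dead rather than a filesystem fault, which matches Varun's advice to look for ZooKeeper session problems at the point where the exceptions begin.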