Re: RS crash upon replication
I believe cascading failures produced these deeply nested nodes, which
contain WAL(s) that still need to be replicated. I suspect a parsing bug or
something similar is keeping the replication source from working. Also, which
version are you using, and does it include
https://issues.apache.org/jira/browse/HBASE-8207? I ask because you use
hyphens in your paths. One way to get back up is to delete these nodes, but
then you lose the data in these WAL(s)...
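
To illustrate the parsing concern, here is a minimal sketch in Python of how a nested queue name like the one in the quoted log could be split into a peer id and the chain of region servers it passed through. This is not HBase's actual code: parse_queue_name and its regex are hypothetical, and the sketch assumes that the peer id contains no hyphen and that every server name has the host,port,startcode form seen in this thread.

```python
import re

# Hypothetical helper, not part of HBase. Assumes the queue name looks like
# "<peerId>-<server>-<server>...", where each server is "host,port,startcode"
# and the peer id itself contains no hyphen.
SERVER = re.compile(r"([^,]+),(\d+),(\d+)(?:-|$)")

def parse_queue_name(name):
    """Return (peer_id, [(host, port, startcode), ...]) for a queue znode name."""
    peer_id, _, rest = name.partition("-")
    servers = [(host, int(port), int(start))
               for host, port, start in SERVER.findall(rest)]
    return peer_id, servers

# Queue name taken from the znode path in the quoted log.
queue = ("1-va-p-hbase-01-c,60020,1369042378287"
         "-va-p-hbase-02-c,60020,1369042377731"
         "-va-p-hbase-02-d,60020,1369233252475")

peer_id, servers = parse_queue_name(queue)
for host, port, startcode in servers:
    print(peer_id, host, port, startcode)

# Splitting on every hyphen cuts the hostnames apart as well.
naive = queue.split("-")
print(len(naive), "pieces from a naive split")
```

Anchoring on the two numeric fields keeps hostnames such as va-p-hbase-01-c intact, whereas a plain split on "-" yields far more pieces than one peer id plus three servers. That is the kind of mismatch I have in mind when I ask about hyphens.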
On Wed, May 22, 2013 at 2:22 PM, Amit Mor <[EMAIL PROTECTED]> wrote:

>  va-p-hbase-02-d,60020,1369249862401
>
>
> On Thu, May 23, 2013 at 12:20 AM, Varun Sharma <[EMAIL PROTECTED]>
> wrote:
>
> > Basically
> >
> > ls /hbase/rs and what do you see for va-p-02-d ?
> >
> >
> > On Wed, May 22, 2013 at 2:19 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> >
> > > Can you do ls /hbase/rs and see what you get for 02-d? Instead of
> > > looking in /replication/, could you look in /hbase/replication/rs? I
> > > want to see whether the timestamps match.
> > >
> > > Varun
> > >
> > >
> > > On Wed, May 22, 2013 at 2:17 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> > >
> > >> I see, so it looks okay. There's just a lot of deep nesting in there.
> > >> If you look into these nodes by doing ls, you should see a bunch of
> > >> WAL(s) which still need to be replicated...
> > >>
> > >> Varun
> > >>
> > >>
> > >> On Wed, May 22, 2013 at 2:16 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> > >>
> > >>> 2013-05-22 15:31:25,929 WARN
> > >>> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient
> > >>> ZooKeeper exception:
> > >>> org.apache.zookeeper.KeeperException$SessionExpiredException:
> > >>> KeeperErrorCode = Session expired for
> > >>> /hbase/replication/rs/va-p-hbase-01-c,60020,1369249873379/1-va-p-hbase-01-c,60020,1369042378287-va-p-hbase-02-c,60020,1369042377731-va-p-hbase-02-d,60020,1369233252475/va-p-hbase-01-c%2C60020%2C1369042378287.1369220050719
> > >>>
> > >>> 01->[01->02->02]->01
> > >>>
> > >>> Looks like a bunch of cascading failures causing this deep nesting...
> > >>>
> > >>>
> > >>> On Wed, May 22, 2013 at 2:09 PM, Amit Mor <[EMAIL PROTECTED]> wrote:
> > >>>
> > >>>> empty return:
> > >>>>
> > >>>> [zk: va-p-zookeeper-01-c:2181(CONNECTED) 10] ls /hbase/replication/rs/va-p-hbase-01-c,60020,1369249873379/1
> > >>>> []
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Thu, May 23, 2013 at 12:05 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> > >>>>
> > >>>> > Can you do an "ls" here instead of a "get" and post the output?
> > >>>> >
> > >>>> > ls /hbase/replication/rs/va-p-hbase-01-c,60020,1369249873379/1
> > >>>> >
> > >>>> >
> > >>>> > On Wed, May 22, 2013 at 1:53 PM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > >>>> >
> > >>>> > > [zk: va-p-zookeeper-01-c:2181(CONNECTED) 3] get /hbase/replication/rs/va-p-hbase-01-c,60020,1369249873379/1
> > >>>> > >
> > >>>> > > cZxid = 0x60281c1de
> > >>>> > > ctime = Wed May 22 15:11:17 EDT 2013
> > >>>> > > mZxid = 0x60281c1de
> > >>>> > > mtime = Wed May 22 15:11:17 EDT 2013
> > >>>> > > pZxid = 0x60281c1de
> > >>>> > > cversion = 0
> > >>>> > > dataVersion = 0
> > >>>> > > aclVersion = 0
> > >>>> > > ephemeralOwner = 0x0
> > >>>> > > dataLength = 0
> > >>>> > > numChildren = 0
> > >>>> > >
> > >>>> > >
> > >>>> > >
> > >>>> > > On Wed, May 22, 2013 at 11:49 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > >>>> > >
> > >>>> > > > What does this command show you ?
> > >>>> > > >
> > >>>> > > > get /hbase/replication/rs/va-p-hbase-01-c,60020,1369249873379/1
> > >>>> > > >
> > >>>> > > > Cheers
> > >>>> > > >
> > >>>> > > > On Wed, May 22, 2013 at 1:46 PM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > >>>> > > >
> > >>>> > > > > ls /hbase/replication/rs/va-p-hbase-01-c,60020,1369249873379
> > >>>> > > > > [1]
> > >>>> > > > > [zk: va-p-zookeeper-01-c:2181(CONNECTED) 2] ls