HBase >> mail # user >> AssignmentManager looping?


Re: AssignmentManager looping?
Jimmy,

  Sounds like our dreaded reference file issue again. I spoke with JM and
he is going to try to reproduce this. My gut tells me our point of no
return may be in the wrong place due to some code change along the way, but
hbck could also just be doing something wonky.

JM,

  This cluster is not CM-managed, correct?
On Aug 1, 2013 1:49 PM, "Jean-Marc Spaggiari" <[EMAIL PROTECTED]>
wrote:

> So I had to remove a few reference files and run hbck a few times to get
> everything back online.
>
> Summary: don't stop your cluster while it's major compacting huge tables ;)
>
> Thanks all!
>
> JM
>
> 2013/8/1 Kevin O'dell <[EMAIL PROTECTED]>
>
> > If that doesn't work you probably have an invalid reference file and you
> > will find that in RS logs for the HLog split that is never finishing.
> > On Aug 1, 2013 1:38 PM, "Kevin O'dell" <[EMAIL PROTECTED]> wrote:
> >
> > > JM,
> > >
> > > Stop HBase
> > > rmr /hbase from zkcli
> > > Sideline META
> > > Run offline meta repair
> > > Start HBase
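
The five steps above can be written down as an ordered dry-run plan. This is only a sketch that prints commands for an operator to review; it executes nothing. The install path `/opt/hbase`, the sideline directory `/hbase-sideline`, and the idea that "sideline META" means moving `/hbase/.META.` aside in HDFS are all assumptions, not something stated in the thread.

```python
def recovery_plan(hbase_home="/opt/hbase", sideline_dir="/hbase-sideline"):
    """Return the recovery steps in order as (description, command) pairs.

    Nothing is executed. hbase_home and sideline_dir are hypothetical
    defaults; adjust them to the actual cluster layout.
    """
    hbase = f"{hbase_home}/bin/hbase"
    return [
        ("Stop HBase", f"{hbase_home}/bin/stop-hbase.sh"),
        # Open the ZooKeeper shell, then type `rmr /hbase` at its prompt.
        ("Remove /hbase from ZooKeeper (type: rmr /hbase)", f"{hbase} zkcli"),
        # Assumption: sidelining META means moving its HDFS directory aside.
        ("Sideline META", f"hadoop fs -mv /hbase/.META. {sideline_dir}/.META."),
        ("Run offline meta repair",
         f"{hbase} org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair"),
        ("Start HBase", f"{hbase_home}/bin/start-hbase.sh"),
    ]


if __name__ == "__main__":
    for number, (description, command) in enumerate(recovery_plan(), start=1):
        print(f"{number}. {description}\n   $ {command}")
```

Keeping the steps as data rather than running them directly makes the order explicit and leaves each destructive step to be confirmed by hand.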
> > > On Aug 1, 2013 1:01 PM, "Jean-Marc Spaggiari" <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > >> Hi Jimmy,
> > >>
> > >> I should still have all the logs.
> > >>
> > >> What I did is pretty simple.
> > >>
> > >> I tried to turn the cluster off while a single-region 250GB table was
> > >> undergoing a major_compaction in order to get split.
> > >>
> > >> I will targz all the logs for the last few days and make that available.
> > >>
> > >> On the other hand, I'm still not able to bring it back up...
> > >>
> > >> JM
> > >>
> > >> 2013/8/1 Jimmy Xiang <[EMAIL PROTECTED]>
> > >>
> > >> > Something went wrong with split.  It should be easy to fix your cluster.
> > >> > However, it will be more interesting to find out how it happened. Do you
> > >> > remember what has happened since it was good previously? Do you have all
> > >> > the logs?
> > >> >
> > >> >
> > >> > On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari <[EMAIL PROTECTED]>
> > >> > wrote:
> > >> >
> > >> > > I tried to remove the znodes but got the same result. So I shut down all
> > >> > > the RS and restarted HBase, and now I have 0 regions for this table.
> > >> > > Running HBCK. Seems that it has a lot to do...
> > >> > >
> > >> > > 2013/8/1 Kevin O'dell <[EMAIL PROTECTED]>
> > >> > >
> > >> > > > Yes you can if HBase is down. First I would copy .META out of HDFS to
> > >> > > > local disk, and then you can search it for split issues. Deleting those
> > >> > > > znodes should clear this up, though.
> > >> > > > On Aug 1, 2013 8:52 AM, "Jean-Marc Spaggiari" <[EMAIL PROTECTED]>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > I can't check the meta since HBase is down.
> > >> > > > >
> > >> > > > > Regarding HDFS, I took a few random lines like:
> > >> > > > > 2013-08-01 08:45:57,260 WARN
> > >> > > > > org.apache.hadoop.hbase.master.AssignmentManager: Region
> > >> > > > > 28328fdb7181cbd9cc4d6814775e8895 not found on server
> > >> > > > > node4,60020,1375319042033; failed processing
> > >> > > > > 2013-08-01 08:45:57,260 WARN
> > >> > > > > org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for
> > >> > > > > region 28328fdb7181cbd9cc4d6814775e8895 from server
> > >> > > > > node4,60020,1375319042033 but it doesn't exist anymore, probably
> > >> > > > > already processed its split
> > >> > > > >
> > >> > > > > And each time, there is nothing like that in HDFS:
> > >> > > > > hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
> > >> > > > > 28328fdb7181cbd9cc4d6814775e8895
> > >> > > > >
> > >> > > > > On ZK side:
> > >> > > > > [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog
> > >> > > > >
> > >> > > > > [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
> > >> > > > > [28328fdb7181cbd9cc4d6814775e8895, a8781a598c46f19723a2405345b58470,
> > >> > > > > b7ebfeb63b10997736fd12920fde2bb8, d95bb27cc026511c2a8c8ad155e79bf6,
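
The manual cross-check in the message above (pick region names out of the master log, then look for them under `/hbase/unassigned`) could be scripted roughly as follows. This is a minimal sketch, assuming each log entry sits on one line and that encoded region names are 32 lowercase hex characters, as in the excerpt. The function names `regions_in_log` and `parse_zk_ls` are hypothetical helpers, not part of HBase or ZooKeeper.

```python
import re

# Encoded region names in the excerpt are 32 lowercase hex characters.
REGION_RE = re.compile(r"\b[Rr]egion\s+([0-9a-f]{32})\b")


def regions_in_log(lines):
    """Collect encoded region names mentioned in master log lines."""
    found = set()
    for line in lines:
        found.update(REGION_RE.findall(line))
    return found


def parse_zk_ls(output):
    """Parse the bracketed, comma-separated list printed by zkcli `ls`."""
    inner = output.strip().strip("[]")
    return {name.strip() for name in inner.split(",") if name.strip()}


if __name__ == "__main__":
    log_lines = [
        "2013-08-01 08:45:57,260 WARN "
        "org.apache.hadoop.hbase.master.AssignmentManager: Region "
        "28328fdb7181cbd9cc4d6814775e8895 not found on server "
        "node4,60020,1375319042033; failed processing",
    ]
    zk_output = ("[28328fdb7181cbd9cc4d6814775e8895, "
                 "a8781a598c46f19723a2405345b58470]")
    # Regions the master keeps warning about that still have a znode.
    for region in sorted(regions_in_log(log_lines) & parse_zk_ls(zk_output)):
        print(region)
```

The intersection lists regions that the master keeps reporting and that still have an unassigned znode, which are the candidates to inspect first.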