HBase, mail # dev - Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2


Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2
lars hofhansl 2012-11-06, 05:17
So it seems you can reproduce this to some extent in 0.94.2, but you have never seen it before?
-- Lars

________________________________
 From: Matt Corgan <[EMAIL PROTECTED]>
To: dev <[EMAIL PROTECTED]>
Sent: Monday, November 5, 2012 9:10 PM
Subject: Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2
 
It happened in this new table that I have all the logs for.  The region in
question this time is 6839663e4f8f79be3d7469784c21cbc2, and the first trace
of this region is on the regionserver, in the "Instantiated tableName..."
message:

2012-11-05 22:24:21,162 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated StatAreaModelLink,\x00\x00\x07\xD9\x00\x00\x00\x0C\x00\x00\x00\x004H\xC4\xB5\x00\x00\x00\x02\x00\x00\x00\x05\x00\x00\x00\x00G.l\x9B,1352172257535.6839663e4f8f79be3d7469784c21cbc2.

I also know this region came from a recent split, but I can't find any log
messages showing the parent finishing the split that created this daughter
region.  So my guess now is that the split is actually finishing and
letting clients continue to write data, but something is failing to print
the log line and correctly tell the master about the new region.
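
For anyone repeating this kind of log search, here is a minimal sketch (not part of the original message). It assumes the region name layout seen in the "Instantiated" log lines in this thread, that is `<table>,<start key>,<region id>.<encoded name>.` with a 32-character hex encoded name. The helper names `parse_region_name` and `mentions` are hypothetical, and the start key in the sample is shortened for readability.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple

# Assumed layout: <table>,<start key>,<region id>.<encoded name>.
# The start key may itself contain commas, so it is matched greedily and
# the region id / encoded name are anchored at the end of the string.
REGION_NAME = re.compile(
    r"^(?P<table>[^,]+),(?P<start_key>.*),"
    r"(?P<region_id>\d+)\.(?P<encoded>[0-9a-f]{32})\.$"
)


class RegionName(NamedTuple):
    table: str
    start_key: str
    region_id: int
    encoded: str


def parse_region_name(name: str) -> Optional[RegionName]:
    """Split a full region name into its parts, or return None if it does not match."""
    m = REGION_NAME.match(name.strip())
    if m is None:
        return None
    return RegionName(m["table"], m["start_key"], int(m["region_id"]), m["encoded"])


def mentions(lines: Iterable[str], encoded: str) -> List[Tuple[int, str]]:
    """Return (1-based line number, line) for every line containing the encoded name."""
    return [
        (i, line.rstrip("\n"))
        for i, line in enumerate(lines, 1)
        if encoded in line
    ]


if __name__ == "__main__":
    # Start key shortened for the example; table and encoded name are from this thread.
    sample = r"StatAreaModelLink,\x00\x00\x07\xD9,1352172257535.6839663e4f8f79be3d7469784c21cbc2."
    region = parse_region_name(sample)
    print(region)
```

Running `mentions()` over the master log and each regionserver log, in time order, shows which process first mentions the encoded name. If the earliest hit is the "Instantiated" line on a regionserver, the parent's split completion was never logged, or was logged somewhere that has since rolled.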

I've noticed that these regions are showing up on the clients in calls to
HTable.getRegionLocations(), so the clients continue to function, but if I
call HBaseAdmin.move() I get an UnknownRegionException.

On Mon, Nov 5, 2012 at 7:07 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> I think what Matt encountered was this:
> https://issues.apache.org/jira/browse/HBASE-7101
>
> On Mon, Nov 5, 2012 at 5:33 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>
> > Beats me - I can't find any record of it before that.
> >
> > I'm importing data into another table now.  I disabled/enabled the table
> > first to make sure we have the original 4 region locations logged
> > everywhere.  Will report back...
> >
> >
> > On Mon, Nov 5, 2012 at 2:15 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> >
> > > Wow, and can you figure out how this happened?
> > >
> > > 2012-11-05 00:24:54,538 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ActiveListingRecord16,\x86\x07\xDC\x03\x17RealtyTrac\x0044737383,1352093084141.22f8fa73d8af837410ca270f344f6bb8.
> > >
> > > Split? Otherwise, how did the master tell that RS to open a region
> > > that doesn't exist?
> > >
> > > On Mon, Nov 5, 2012 at 2:02 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> > > > Here's one from last night with the master logs at the bottom.
> > > > http://pastebin.com/cSdMbA2a
> > > >
> > > > I don't see that region in the master logs for 4 days back.  I'm
> > > > going to import some new data from scratch soon and will be sure
> > > > to keep all the master logs.
> > > >
> > > >
> > > > On Mon, Nov 5, 2012 at 9:52 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> > > >
> > > > >> This reminds me a lot of https://issues.apache.org/jira/browse/HBASE-4792
> > > >>
> > > >> What I'd like to see is the log from the first time the master
> > > >> receives the split message. I guess it says the region doesn't exist
> > > >> anymore because the split was processed already in the master but
> > > >> there's a failure mode similar to 4792.
> > > >>
> > > >> I saw this on another cluster last week but the logs were rolling
> > > >> based on size so the original split message was lost.
> > > >>
> > > >> J-D
> > > >>
> > > > >> On Sun, Nov 4, 2012 at 5:16 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> > > > >> > I've had a few successful splits today without any errors.  I'll
> > > > >> > turn up the importer speed tomorrow to start stressing the
> > > > >> > cluster more.
> > > >> >
> > > >> >
> > > > >> > On Sun, Nov 4, 2012 at 2:07 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > > >> >
> > > >> >> Matt:
> > > > >> >> Was there any region server crash before you noticed the repeated
> > > > >> >> log for bc62a8a72124a4ba3f6b9f302587903c ?
> > > >> >>
> > > >> >> I wonder if the following JIRA might be related: