HBase, mail # dev - Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2


Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2
ramkrishna vasudevan 2012-11-06, 06:30
The log shows that even the first time the region transitioned to SPLITTING,
it was not populated in the Master's memory.

On Tue, Nov 6, 2012 at 11:29 AM, ramkrishna vasudevan <[EMAIL PROTECTED]> wrote:

> Could you attach the master logs from around this time:
> 2012-11-05 00:24:55?
>
> Regards
> Ram
>
> On Tue, Nov 6, 2012 at 11:15 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> Took a brief look through all SPLIT related commits since 0.94.0... Found
>> these:
>>
>> HBASE-6854 *
>> HBASE-6713
>> HBASE-6329 *
>>
>> HBASE-6088
>>
>> HBASE-5986
>> HBASE-6070 *
>>
>>
>> The ones marked with * are (IMHO) more likely to be related.
>>
>> -- Lars
>>
>> ________________________________
>> From: Matt Corgan <[EMAIL PROTECTED]>
>> To: dev <[EMAIL PROTECTED]>; lars hofhansl <[EMAIL PROTECTED]>
>> Sent: Monday, November 5, 2012 9:28 PM
>> Subject: Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2
>>
>> Yeah - we were running .94.0 since it came out but never saw it there.
>> I'll keep trying to narrow it down.  The only harm it's causing is log
>> spam and failing to move daughters to a new regionserver, which are
>> definitely problems, but it's not bringing down the cluster.
>>
>>
>> On Mon, Nov 5, 2012 at 9:17 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>>
>> > So it seems you can repeat this to some extent in 0.94.2, but you have
>> > never seen this before?
>> >
>> >
>> > -- Lars
>> >
>> >
>> >
>> > ________________________________
>> >  From: Matt Corgan <[EMAIL PROTECTED]>
>> > To: dev <[EMAIL PROTECTED]>
>> > Sent: Monday, November 5, 2012 9:10 PM
>> > Subject: Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2
>> >
>> > It happened in this new table that I have all the logs for.  The region in
>> > question this time is 6839663e4f8f79be3d7469784c21cbc2, and the first trace
>> > of this region is on the regionserver with the "Instantiated tableName..."
>> > message
>> >
>> > 2012-11-05 22:24:21,162 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated
>> StatAreaModelLink,\x00\x00\x07\xD9\x00\x00\x00\x0C\x00\x00\x00\x004H\xC4\xB5\x00\x00\x00\x02\x00\x00\x00\x05\x00\x00\x00\x00G.l\x9B,1352172257535.6839663e4f8f79be3d74
>> > 9784c21cbc2.
>> >
>> > I also know this region came from a recent split, but I can't find any log
>> > messages showing the parent finishing the split that created this daughter
>> > region.  So my guess now is that the split is actually finishing and
>> > letting clients continue to write data, but something is failing to print
>> > the log line and correctly tell the master about the new region.
>> >
>> > I've noticed that these regions are showing up on the clients in calls to
>> > HTable.getRegionLocations(), so the clients continue to function, but if I
>> > call HBaseAdmin.move() I get an UnknownRegionException.
>> >
>> >
>> > On Mon, Nov 5, 2012 at 7:07 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>> >
>> > > I think what Matt encountered was this:
>> > > https://issues.apache.org/jira/browse/HBASE-7101
>> > >
>> > > On Mon, Nov 5, 2012 at 5:33 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>> > >
>> > > > Beats me - i can't find any record of it before that.
>> > > >
>> > > > I'm importing data into another table now.  I disabled/enabled the table
>> > > > first to make sure we have the original 4 region locations logged
>> > > > everywhere.  Will report back...
>> > > >
>> > > >
>> > > > On Mon, Nov 5, 2012 at 2:15 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>> > > >
>> > > > > Wow, and can you figure how this happened?
>> > > > >
>> > > > > 2012-11-05 00:24:54,538 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated ActiveListingRecord16,\x86\x07\xDC\x03\x17RealtyTrac\x0044737383,1352093084141.22f8fa73d8af837410ca270f344f6bb8.
>> > > > >
>> > > > > Split? Else how did the master tell that RS to open a region that
>> > > > > doesn't exist?
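
Much of the thread comes down to tracing one region through the regionserver and master logs. The region names in the "Instantiated" lines follow the shape table,startkey,regionid.encodedname. (with a trailing dot), so the 32-character encoded name can be pulled out and used as a search key. The following is only a rough sketch of that idea in Python; the function names parse_region_name and lines_mentioning are hypothetical helpers, not part of HBase, and the regular expression assumes the 0.94-style name format seen in this thread.

```python
import re

# Assumed tail of a region name as printed in these logs:
#   ",<numeric region id>.<32 hex chars>."
REGION_TAIL = re.compile(r",(\d+)\.([0-9a-f]{32})\.$")


def parse_region_name(name):
    """Split a printed region name into its parts.

    Returns None when the name does not end in the expected tail.
    The start key may itself contain commas, so only the first comma
    (after the table name) and the matched tail are treated as separators.
    """
    match = REGION_TAIL.search(name.strip())
    if match is None:
        return None
    head = name.strip()[: match.start()]
    table, _, start_key = head.partition(",")
    return {
        "table": table,
        "start_key": start_key,
        "region_id": int(match.group(1)),
        "encoded": match.group(2),
    }


def lines_mentioning(encoded, lines):
    """Return the log lines that contain the given encoded region name."""
    return [line for line in lines if encoded in line]


if __name__ == "__main__":
    # Region name copied from the log line quoted in this thread.
    sample = (
        r"ActiveListingRecord16,\x86\x07\xDC\x03\x17RealtyTrac\x0044737383,"
        r"1352093084141.22f8fa73d8af837410ca270f344f6bb8."
    )
    print(parse_region_name(sample))
```

Running lines_mentioning over both the master and regionserver logs with the same encoded name gives a quick view of which side ever saw the region, which is the comparison being asked for above.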