HBase, mail # dev - Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2


Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2
Jean-Daniel Cryans 2012-11-05, 22:15
Wow, and can you figure out how this happened?

2012-11-05 00:24:54,538 DEBUG
org.apache.hadoop.hbase.regionserver.HRegion: Instantiated
ActiveListingRecord16,\x86\x07\xDC\x03\x17RealtyTrac\x0044737383,1352093084141.22f8fa73d8af837410ca270f344f6bb8.

Was it a split? Otherwise, how did the master tell that RS to open a region that doesn't exist?

On Mon, Nov 5, 2012 at 2:02 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
> Here's one from last night with the master logs at the bottom.
> http://pastebin.com/cSdMbA2a
>
> I don't see that region in the master logs for 4 days back.  I'm going to
> import some new data from scratch soon and will be sure to keep all the
> master logs.
>
>
> On Mon, Nov 5, 2012 at 9:52 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>
>> This reminds me a lot of https://issues.apache.org/jira/browse/HBASE-4792
>>
>> What I'd like to see is the log from the first time the master
>> receives the split message. I guess it says the region doesn't exist
>> anymore because the split was processed already in the master but
>> there's a failure mode similar to 4792.
>>
>> I saw this on another cluster last week but the logs were rolling
>> based on size so the original split message was lost.
>>
>> J-D
>>
>> On Sun, Nov 4, 2012 at 5:16 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>> > I've had a few successful splits today without any errors.  I'll turn up
>> > the importer speed tomorrow to start stressing the cluster more.
>> >
>> >
>> > On Sun, Nov 4, 2012 at 2:07 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>> >
>> >> Matt:
>> >> Was there any region server crash before you noticed the repeated log for
>> >> bc62a8a72124a4ba3f6b9f302587903c ?
>> >>
>> >> I wonder if the following JIRA might be related:
>> >> HBASE-7083 SSH#fixupDaughter should force re-assign missing daughter
>> >>
>> >> Thanks
>> >>
>> >> On Sat, Nov 3, 2012 at 7:27 PM, Matt Corgan <[EMAIL PROTECTED]> wrote:
>> >>
>> >> > Strangely, I don't see any record of that region in the master before what
>> >> > I already pasted even though I have logs back to 10/30.  Next time it
>> >> > happens I'll gather a full log record and try to debug while it's
>> >> > occurring.
>> >> >
>> >> >
>> >> > On Sat, Nov 3, 2012 at 7:10 PM, rajesh babu chintaguntla <
>> >> > [EMAIL PROTECTED]> wrote:
>> >> >
>> >> > > Hi Matt,
>> >> > > Can you paste some more master logs for region
>> >> > > bc62a8a72124a4ba3f6b9f30258790 from before the split?
>> >> > > I don't think it's a problem with splitting.
>> >> > > We are getting
>> >> > >       LOG.warn("Region " + encodedName + " not found on server " + serverName +
>> >> > >         "; failed processing");
>> >> > > This log means there is no entry in the servers map (the region is not fully assigned).
>> >> > >     Set<HRegionInfo> hris = this.servers.get(sn);
>> >> > >     HRegionInfo foundHri = null;
>> >> > >     for (HRegionInfo hri: hris) {
>> >> > >       if (hri.getEncodedName().equals(encodedName)) {
>> >> > >         foundHri = hri;
>> >> > >         break;
>> >> > >       }
>> >> > >     }
>> >> > >     return foundHri;
>> >> > >
>> >> > >
>> >> > >
>> >> > >
>> >> > > On Sun, Nov 4, 2012 at 6:07 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>> >> > >
>> >> > > > CC'ing dev list...
>> >> > > >
>> >> > > > Is anybody aware of any changes that went in recently that could cause
>> >> > > > this?
>> >> > > > I looked around a bit, but could not find anything obvious.
>> >> > > >
>> >> > > > -- Lars
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > ________________________________
>> >> > > >  From: Matt Corgan <[EMAIL PROTECTED]>
>> >> > > > To: user <[EMAIL PROTECTED]>
>> >> > > > Sent: Saturday, November 3, 2012 5:27 PM
>> >> > > > Subject: Re: infinite loop of RS_ZK_REGION_SPLIT on .94.2
>> >> > > >
>> >> > > > I think the cluster is ok without running hbck, as restarting the
>> >> > > > regionserver process stops the warnings and everything looks ok otherwise.
>> >> > > >
>> >> > > > here's the regionserver right after the split happens: