HBase, mail # user - Never ending transtionning regions.


Re: Never ending transtionning regions.
Kevin O'dell 2013-02-24, 00:41
+Dev

I think priority number one is to fix whatever is leaving regions in this
state.  I think we could also put logic into hbck for this.
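
A rough sketch of the check discussed in this thread: walk each region of a table, find the split pointer (reference) files, and report the ones whose target storefile is missing. This is not hbck code. It assumes reference files are named `<storefile>.<encoded parent region name>` and live under `<table>/<region>/<family>/`, and it reads a local copy of the table directory rather than HDFS. The function name `find_dangling_references` is made up for illustration. It only reports; it deletes nothing.

```python
import os
import re

# Assumed naming convention for split pointers:
#   <storefile name>.<encoded parent region name>
REFERENCE_NAME = re.compile(r"^([0-9a-f]+)\.([0-9a-f]+)$")


def find_dangling_references(table_dir):
    """Return the pointer files under table_dir whose target storefile is missing.

    Assumed layout: <table_dir>/<region>/<family>/<file>.
    """
    dangling = []
    for region in sorted(os.listdir(table_dir)):
        region_dir = os.path.join(table_dir, region)
        if not os.path.isdir(region_dir):
            continue
        for family in sorted(os.listdir(region_dir)):
            family_dir = os.path.join(region_dir, family)
            if not os.path.isdir(family_dir):
                continue  # skips plain files such as region metadata
            for name in sorted(os.listdir(family_dir)):
                match = REFERENCE_NAME.match(name)
                if not match:
                    continue  # a regular storefile, not a pointer
                storefile, parent_region = match.groups()
                # The pointer should resolve to this file in the parent region.
                target = os.path.join(table_dir, parent_region, family, storefile)
                if not os.path.isfile(target):
                    dangling.append(os.path.join(family_dir, name))
    return dangling
```

Calling `find_dangling_references("/path/to/copy/of/hbase/entry")` returns a list of pointer paths to review by hand. Whether removing them automatically is safe is exactly the open question here, so the sketch stops at reporting.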

On Sat, Feb 23, 2013 at 7:36 PM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi Kevin,
>
> I stopped HBase to merge some regions so I already had to deal with the
> downtime. But with the online merge coming it's very good to know the
> online way to do it.
>
> Now, is there an automated way to do it? In HBCK? Maybe we can check each
> region for links, check that the files those links point to exist, and if
> not, remove them? Or would it be too risky?
>
> JM
>
>
>
>
>
> 2013/2/23 Kevin O'dell <[EMAIL PROTECTED]>
>
> > JM,
> >
> >   Here is what I am seeing:
> >
> > 2013-02-23 15:46:14,630 ERROR
> > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of
> > region=entry,ac.adanac-oidar.www\x1Fhttp\x1F-1\x1F/sports/patinage/2012/04/04/001-artistique-trophee-mondial.shtml\x1Fnull,1361651769136.6dd77bc9ff91e0e6d413f74e670ab435.,
> > starting to roll back the global memstore size.
> >
> > If you checked 6dd77bc9ff91e0e6d413f74e670ab435 you should have seen some
> > pointer files to 2ebfef593a3d715b59b85670909182c9.  Typically, you would
> > see the storefiles in 6dd77bc9ff91e0e6d413f74e670ab435 and
> > 2ebfef593a3d715b59b85670909182c9
> > would have been empty from a bad split.  What I do is to delete the
> > pointers that don't reference any storefiles.  Then you can clear the
> > unassigned folder in zkCli.  Finally, run an unassign on the RITs.  This
> > way there is no down time and you don't have to drop any tables.
> >
> >
> > On Sat, Feb 23, 2013 at 6:14 PM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> > > Hi Kevin,
> > >
> > > Thanks for taking the time to reply.
> > >
> > > Here is a bigger extract of the logs. I don't see another path in the
> > > logs.
> > >
> > > http://pastebin.com/uMxGyjKm
> > >
> > > I can send you the entire log if you want (42 MB).
> > >
> > > What I did is I merged many regions together, then altered the table to
> > > set the max_filesize and started a major_compaction to get the table
> > > split.
> > >
> > > To fix the issue I had to drop one working table and run -repair
> > > multiple times. Now it's fixed, but I still have the logs.
> > >
> > > I'm redoing all the steps I did. Maybe I will face the issue again. If
> > > I'm able to reproduce it, we might be able to figure out where the
> > > issue is...
> > >
> > > JM
> > >
> > > 2013/2/23 Kevin O'dell <[EMAIL PROTECTED]>
> > >
> > > > JM,
> > > >
> > > >   How are you doing today?  Right before the "File does not exist"
> > > > error there should be another path.  Can you let me know if that path
> > > > contains pointers from a split to 2ebfef593a3d715b59b85670909182c9?
> > > > The directory may already exist.  I have seen this a couple of times
> > > > now and am trying to ferret out a root cause to open a JIRA with.  I
> > > > suspect we have a split code bug in 0.92+.
> > > >
> > > > On Sat, Feb 23, 2013 at 4:10 PM, Jean-Marc Spaggiari <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I have 2 regions that have been transitioning from server to server
> > > > > for 15 minutes now.
> > > > >
> > > > > I have nothing in the master logs about those 2 regions, but in the
> > > > > region server logs I have some file-not-found errors:
> > > > >
> > > > > 2013-02-23 16:02:07,347 ERROR
> > > > > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of
> > > > > region=entry,theykey,1361651769136.6dd77bc9ff91e0e6d413f74e670ab435.,
> > > > > starting to roll back the global memstore size.
> > > > > java.io.IOException: java.io.IOException: java.io.FileNotFoundException:
> > > > > File does not exist:
> > > > > /hbase/entry/2ebfef593a3d715b59b85670909182c9/a/62b0aae45d59408dbcfc513954efabc7
> > > > >     at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:597)

Kevin O'Dell
Customer Operations Engineer, Cloudera