Re: Never ending transtionning regions.
Kevin O'dell 2013-02-24, 00:41
+Dev
I think number 1 we fix whatever is leaving regions in this state. I think we could put logic into hbck for this.

On Sat, Feb 23, 2013 at 7:36 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:

> Hi Kevin,
>
> I stopped HBase to merge some regions so I already had to deal with the
> downtime. But with the online merge coming it's very good to know the
> online way to do it.
>
> Now, is there an automated way to do it? In HBCK? Maybe we can check each
> region to see if there are links, check that those links exist, and if
> not, we remove them? Or would it be too risky?
>
> JM
>
> 2013/2/23 Kevin O'dell <[EMAIL PROTECTED]>
>
> > JM,
> >
> > Here is what I am seeing:
> >
> > 2013-02-23 15:46:14,630 ERROR
> > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of
> > region=entry,ac.adanac-oidar.www\x1Fhttp\x1F-1\x1F/sports/patinage/2012/04/04/001-artistique-trophee-mondial.shtml\x1Fnull,1361651769136.6dd77bc9ff91e0e6d413f74e670ab435.,
> > starting to roll back the global memstore size.
> >
> > If you checked 6dd77bc9ff91e0e6d413f74e670ab435 you should have seen some
> > pointer files to 2ebfef593a3d715b59b85670909182c9. Typically, you would
> > see the storefiles in 6dd77bc9ff91e0e6d413f74e670ab435 and
> > 2ebfef593a3d715b59b85670909182c9 would have been empty from a bad split.
> > What I do is delete the pointers that don't reference any storefiles.
> > Then you can clear the unassigned folder in zkCli. Finally, run an
> > unassign on the RITs. This way there is no downtime and you don't have
> > to drop any tables.
> >
> > On Sat, Feb 23, 2013 at 6:14 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Kevin,
> > >
> > > Thanks for taking the time to reply.
> > >
> > > Here is a bigger extract of the logs. I don't see another path in the
> > > logs.
> > >
> > > http://pastebin.com/uMxGyjKm
> > >
> > > I can send you the entire log if you want (42 MB).
> > >
> > > What I did is I merged many regions together, then altered the table to
> > > set the max_filesize and started a major_compaction to get the table
> > > split.
> > >
> > > To fix the issue I had to drop one working table, and ran -repair
> > > multiple times. Now it's fixed, but I still have the logs.
> > >
> > > I'm redoing all the steps I did. Maybe I will face the issue again. If
> > > I'm able to reproduce, we might be able to figure out where the issue
> > > is...
> > >
> > > JM
> > >
> > > 2013/2/23 Kevin O'dell <[EMAIL PROTECTED]>
> > >
> > > > JM,
> > > >
> > > > How are you doing today? Right before the "File does not exist"
> > > > there should be another path. Can you let me know if in that path
> > > > there are any pointers from a split to
> > > > 2ebfef593a3d715b59b85670909182c9? The directory may already exist.
> > > > I have seen this a couple of times now and am trying to ferret out
> > > > a root cause to open a JIRA with. I suspect we have a split code
> > > > bug in .92+
> > > >
> > > > On Sat, Feb 23, 2013 at 4:10 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I have 2 regions transitioning from server to server for 15
> > > > > minutes now.
> > > > >
> > > > > I have nothing in the master logs about those 2 regions, but in
> > > > > the region server logs I have some "file not found" errors:
> > > > >
> > > > > 2013-02-23 16:02:07,347 ERROR
> > > > > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of
> > > > > region=entry,theykey,1361651769136.6dd77bc9ff91e0e6d413f74e670ab435.,
> > > > > starting to roll back the global memstore size.
> > > > > java.io.IOException: java.io.IOException: java.io.FileNotFoundException:
> > > > > File does not exist:
> > > > > /hbase/entry/2ebfef593a3d715b59b85670909182c9/a/62b0aae45d59408dbcfc513954efabc7
> > > > > at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:597)

Kevin O'Dell
Customer Operations Engineer, Cloudera
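
For anyone wanting to try the "check that each link's target exists" idea from this thread before touching anything, here is a rough sketch of the check only. It is not hbck code and it deletes nothing. It makes two assumptions that should be verified against your own cluster: that files live under /hbase/<table>/<region>/<family>/<file>, and that a split pointer (reference) file is named <storefile>.<parent encoded region name>. The function name find_dangling_references and the sample pointer file name are hypothetical; the sample is built from the region and storefile IDs in the log above. The input is a plain list of paths, for example copied from a recursive HDFS listing.

```python
from collections import namedtuple

# link_path: the pointer file found in a daughter region
# target_path: the storefile it should resolve to in the parent region
Reference = namedtuple("Reference", "link_path target_path")


def find_dangling_references(paths):
    """Return pointer files whose target storefile is not in the listing.

    Assumes /<root>/<table>/<region>/<family>/<file> and that a pointer
    file is named <storefile>.<parent region> (assumption, see above).
    """
    existing = set(paths)
    dangling = []
    for path in sorted(existing):
        parts = path.strip("/").split("/")
        if len(parts) != 5:
            continue  # not a file inside a column family directory
        root, table, _region, family, name = parts
        hfile, sep, parent = name.partition(".")
        if not sep or not parent:
            continue  # plain storefile, not a pointer
        target = "/" + "/".join([root, table, parent, family, hfile])
        if target not in existing:
            dangling.append(Reference(path, target))
    return dangling


if __name__ == "__main__":
    # Hypothetical listing: one pointer file, parent storefile absent.
    listing = [
        "/hbase/entry/6dd77bc9ff91e0e6d413f74e670ab435/a/"
        "62b0aae45d59408dbcfc513954efabc7.2ebfef593a3d715b59b85670909182c9",
    ]
    for ref in find_dangling_references(listing):
        print("dangling:", ref.link_path, "->", ref.target_path)
```

The sketch only reports candidates. Whether removing a reported pointer is safe still depends on where the real storefiles ended up, which is the risk raised in the thread.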