Re: why not hadoop backup name node data to local disk daily or hourly?
Robert Dyer 2012-12-26, 14:17
I actually have this exact same error. After running my namenode for
a while (with an SNN), it gets to a point where the SNN starts crashing, and
if I try to restart the NN I get this problem. I typically wind up
having to go back to a much older copy of the image and edits files in order
to get it up and running, and naturally that means data loss.

On Mon, Dec 24, 2012 at 8:22 PM, 周梦想 <[EMAIL PROTECTED]> wrote:

> Thanks Tariq,
> Now we are trying to recover data, but some data has been lost forever.
>
> The logs just reported a NullPointerException:
>
> 2012-12-17 17:09:05,646 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1094)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1106)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1009)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:208)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:626)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1015)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:833)
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:372)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
>
> We changed the Hadoop source to catch this exception and rebuilt
> it. After that we could start the Hadoop NN, but the problem with HBase remained,
> so we have to upgrade HBase and try to repair the HBase meta
> data from the region data.
> Now we are planning to upgrade to the stable versions, Hadoop 1.0.4 and HBase
> 0.94.3.
>
> Best regards,
> Andy
>
> 2012/12/24 Mohammad Tariq <[EMAIL PROTECTED]>
>
>> Hello Andy,
>>
>> I hope you are stable now :)
>>
>> Just a quick question. Did you find anything interesting in the NN, SNN,
>> DN logs?
>>
>> And my grandma says, I look like Abhishek Bachchan<http://en.wikipedia.org/wiki/Abhishek_Bacchan> ;)
>>
>> Best Regards,
>> Tariq
>> +91-9741563634
>> https://mtariq.jux.com/
>>
>>
>> On Mon, Dec 24, 2012 at 4:24 PM, 周梦想 <[EMAIL PROTECTED]> wrote:
>>
>>> I stopped Hadoop, changed every node's IP, reconfigured, and
>>> started Hadoop again. Yes, we did change the IP of the NN.
>>>
>>>
>>> 2012/12/24 Nitin Pawar <[EMAIL PROTECTED]>
>>>
>>>> What do you mean by "We changed all IPs of the Hadoop System"?
>>>>
>>>> Did you change the IPs of all the nodes in one go, or did you retire nodes one by
>>>> one, change their IPs, and bring them back into rotation? Also, did you change the
>>>> IP of your NN as well?
>>>>
>>>>
>>>>
>>>> On Mon, Dec 24, 2012 at 4:10 PM, 周梦想 <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Actually the problem began at the SecondaryNameNode. We changed all
>>>>> IPs of the Hadoop System
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Nitin Pawar
>>>>
>>>
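
On the question in the subject line, here is a minimal sketch of a periodic local backup of the NameNode metadata, so that a restore does not have to fall back on a much older image and edits. It is only an illustration: the function name `backup_name_dir`, the `namenode-` snapshot prefix, and the paths in the usage note are hypothetical, and it assumes the metadata lives in a single directory (the one configured as `dfs.name.dir` in Hadoop 1.x). Copying the directory while the NN is writing to it may capture an inconsistent image/edits pair, so treat this as a supplement to the SNN checkpoint, not a replacement.

```python
import shutil
import time
from pathlib import Path


def backup_name_dir(name_dir, backup_root, keep=24):
    """Copy name_dir into a new timestamped folder under backup_root.

    Keeps only the `keep` most recent snapshots and returns the new path.
    All names here are illustrative, not part of Hadoop.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    name_dir = Path(name_dir)
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)

    # Timestamp plus zero-padded counter: names sort chronologically as text,
    # and two snapshots taken within the same second do not collide.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    counter = 0
    target = backup_root / f"namenode-{stamp}-{counter:03d}"
    while target.exists():
        counter += 1
        target = backup_root / f"namenode-{stamp}-{counter:03d}"

    # Recursively copy the whole metadata directory (image and edits files).
    shutil.copytree(name_dir, target)

    # Prune the oldest snapshots beyond the retention count.
    snapshots = sorted(
        p for p in backup_root.iterdir()
        if p.is_dir() and p.name.startswith("namenode-")
    )
    for old in snapshots[:-keep]:
        shutil.rmtree(old)

    return target
```

Run hourly from cron with something like `backup_name_dir("/data/dfs/name", "/backup/namenode", keep=24)` (both paths are placeholders), this would keep roughly a day of snapshots on local disk.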