mouradk 2012-07-30, 17:30
Re: Fix a corrupt edits file?
Kihwal Lee 2012-07-30, 19:39
Probably the last entry is partial or is complete but not terminated properly. You need to hexedit the file in order to correct the error. You can also pull HDFS-1378 and figure out the offset where you can put OP_INVALID (0xff). HDFS-3055 implements the interactive recovery mode, which makes it even easier.

Kihwal

On 7/30/12 12:30 PM, "mouradk" <[EMAIL PROTECTED]> wrote:

>Hello all,
>
>I have just had a problem with a NameNode restart and someone on the
>mailing list kindly suggested that the edits file was corrupted. I have
>made a backup copy of the file and checked my
>/namesecondary/previous.checkpoint but the edits file there is empty 4kb
>with ????? inside.
>
>This suggest to me that I cannot recover from the secondaryNameNode? How
>do you fix this problem?
>
>Thanks for your help.
>
>Original error log:
>TARTUP_MSG: build =https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
>************************************************************/
>2012-07-30 16:02:23,649 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=50001
>2012-07-30 16:02:23,656 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: localhost/127.0.0.1:50001
>2012-07-30 16:02:23,659 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
>2012-07-30 16:02:23,660 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
>2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
>2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
>2012-07-30 16:02:23,714 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=false
>2012-07-30 16:02:23,721 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
>2012-07-30 16:02:23,723 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
>2012-07-30 16:02:23,756 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 533
>2012-07-30 16:02:23,833 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 2
>2012-07-30 16:02:23,835 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 55400 loaded in 0 seconds.
>2012-07-30 16:02:23,844 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NumberFormatException: For input string: "1343506"
>    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
>    at java.lang.Long.parseLong(Long.java:419)
>    at java.lang.Long.parseLong(Long.java:468)
>    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.readLong(FSEditLog.java:1273)
>    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:775)
>    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:992)
>    at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:812)
>    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:364)
>    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
>    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
>    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
>    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
>    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
>    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
>    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
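
For anyone who would rather script the hex edit than do it by hand, the suggestion in the reply (put OP_INVALID, 0xff, at the offset where the log should end) could be sketched roughly as below. This is only an illustration, not a tested recovery procedure: the function name mark_end_of_edits and the file names are hypothetical, and the offset is an input you have to work out yourself (for example from the HDFS-1378 output or from inspecting the file). The sketch writes to a copy and leaves the original edits file alone. It overwrites a single byte and does not truncate or pad whatever follows; whether that is enough for a given Hadoop version is an assumption to verify.

```python
import shutil
from pathlib import Path

OP_INVALID = 0xFF  # marker byte named in the reply above


def mark_end_of_edits(src, dst, offset):
    """Copy src to dst, then overwrite the byte at `offset` in dst with OP_INVALID.

    `offset` must be determined separately; this helper does not parse the
    edits format and cannot find the damaged entry on its own.
    """
    src, dst = Path(src), Path(dst)
    size = src.stat().st_size
    if not 0 <= offset <= size:
        raise ValueError(f"offset {offset} is outside the file (size {size})")
    shutil.copyfile(src, dst)  # work on a copy, never on the original
    with open(dst, "r+b") as f:
        f.seek(offset)
        f.write(bytes([OP_INVALID]))  # offset == size appends one byte
    return dst
```

A call would look like mark_end_of_edits("edits.backup", "edits.fixed", offset), with both file names being placeholders. Try the patched copy on a scratch NameNode directory before replacing anything in the real one.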