-
name node can't startup
周梦想 2012-12-17, 09:51

Hello,

I ran into a problem with hadoop 1.02. At first the secondary name node exited and couldn't be restarted. It reports an error like the one below:

2012-12-17 17:09:05,646 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1094)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1106)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1009)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:208)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:626)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1015)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:833)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:372)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)

Then I shut down all the processes, but the name node can't start up either, and it reports the same error.

I suspect that edits or edits.new may be corrupted. Could anyone give me some advice, including how to repair edits manually?

Thanks!
Andy zhou
-
Re: name node can't startup
Mohammad Tariq 2012-12-17, 14:01
Hello Andy,
Can I have a look at the NameNode logs? Also paste your config files.

Best Regards,
Tariq
+91-9741563634
-
Re: name node can't startup
周梦想 2012-12-17, 18:16
The log is:

2012-12-17 17:09:05,646 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.NullPointerException
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1094)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1106)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1009)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:208)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:626)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1015)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:833)
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:372)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)

core-site.xml:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://h46:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hbase/hadoopdata/tmp</value>
  </property>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/home/hbase/hadoop/conf/dfs.hosts.exclude</value>
  </property>
  <property>
    <name>local.cache.size</name>
    <value>10737418240</value>
    <description>10G</description>
  </property>
</configuration>

hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hbase/hadoopdata/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hbase/hadoopdata/data/data</value>
  </property>
  <property>
  ...
</configuration>

mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>h46:9001</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/home/hbase/hadoopdata/mapred/local</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/home/hbase/hadoopdata/mapred/system</value>
  </property>
  ...
</configuration>

hbase-site.xml:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://h46:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>h46,h47,h48</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hbase/hadoopdata/zookeeper</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>90000</value>
  </property>
  ...
</configuration>

This problem may have been caused by our changing the IP addresses of the systems.

I overwrote the "edits" file with 0xffffffeeff. After that the name node could start up, but HBase can't list tables.

[hbase@h46 ~]$ hadoop fsck /
...
The filesystem under path '/' is CORRUPT

I removed many tables with "hadoop fs -rmr", so now I can list the table names, but when I scan a table the scan never returns. I don't know why there is so much corruption. The .META. and -ROOT- tables are both corrupt.

We back up the name node in 3 ways: the secondary name node, an NFS copy of the name node data on a different node, and the name node itself. None of them helped, because all the fsimage and edits files were polluted by the bad data from the name node.

I think hadoop should back up the name node into different directories named by date or time. It's important to keep a history of these files. If someone needs a tool that backs up the name node daily into a separate directory, I could write one.

Thank you, Mohammad.

Andy
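For readers following the thread, the overwrite of "edits" described above can be sketched roughly as below. This is only an illustration under assumptions, not a recommended repair: the function name `reset_edits` and the `.bak` suffix are hypothetical, and the location of the file (`current/edits` under dfs.name.dir) is assumed rather than stated in the post. The five bytes are the ones the poster gives; they look like a 4-byte layout version (0xffffffee, i.e. -18) followed by a 0xff end-of-log marker, and the thread does not confirm that this version matches the installed release. Overwriting the file throws away every operation that was recorded in it, so the sketch keeps a copy first.

```python
import shutil
from pathlib import Path

# The five bytes quoted in the post (0xffffffeeff). Assumption: a 4-byte
# layout version followed by a one-byte end-of-log opcode.
EMPTY_EDITS = bytes.fromhex("ffffffeeff")


def reset_edits(edits_path):
    """Keep a copy of the edits file, then overwrite it with EMPTY_EDITS.

    Hypothetical helper: everything previously recorded in the file is
    lost from the live copy and survives only in the backup.
    """
    edits_path = Path(edits_path)
    backup = edits_path.with_name(edits_path.name + ".bak")
    if backup.exists():
        # Refuse to clobber an earlier backup of the original file.
        raise FileExistsError(f"backup already exists: {backup}")
    shutil.copy2(edits_path, backup)
    edits_path.write_bytes(EMPTY_EDITS)
    return backup


# Example call (path assumed from dfs.name.dir in hdfs-site.xml above):
# reset_edits("/home/hbase/hadoopdata/name/current/edits")
```

Because the discarded operations are never applied to the fsimage, files created since the last checkpoint disappear from the namespace, which would be consistent with the corruption that fsck reports afterwards.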
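The dated backup idea from the end of the post could look something like the following. It is a minimal sketch, not an existing Hadoop tool: `backup_name_dir`, the `keep` parameter, and the backup location are all hypothetical, and the source path is taken from dfs.name.dir in the hdfs-site.xml above. It also assumes that a plain directory copy is acceptable; a copy taken while the name node is writing may not be consistent.

```python
import shutil
import time
from pathlib import Path


def backup_name_dir(name_dir, backup_root, keep=7, stamp=None):
    """Copy name_dir to backup_root/<stamp> and keep the newest `keep` copies.

    Hypothetical helper. `stamp` defaults to the current local time as
    YYYYMMDD-HHMMSS, so directory names sort in chronological order.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    name_dir = Path(name_dir)
    backup_root = Path(backup_root)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    backup_root.mkdir(parents=True, exist_ok=True)
    target = backup_root / stamp
    # copytree raises if the target already exists, so an earlier
    # snapshot is never silently overwritten.
    shutil.copytree(name_dir, target)
    # Prune the oldest snapshots; name order equals time order here.
    snapshots = sorted(p for p in backup_root.iterdir() if p.is_dir())
    for old in snapshots[:-keep]:
        shutil.rmtree(old)
    return target


# Example call, e.g. from a daily cron job (backup path is made up):
# backup_name_dir("/home/hbase/hadoopdata/name", "/backup/namenode")
```

Keeping several dated copies addresses the failure described in the post: with only mirrored copies, bad data reaches every copy at once, while an older snapshot still holds the fsimage and edits from before the problem.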