RE: namenode not starting
Abhay,
  Sounds like your namenode cannot find the metadata it needs to start (the <path>/current, image, *checkpoints, etc.).
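
  A quick way to check whether a candidate directory still holds that metadata is sketched below. It assumes the Hadoop 1.x layout, where a formatted dfs.name.dir has a current/ directory containing VERSION, fsimage and edits; the file list and the function name are my assumptions, so adjust them for your version.

```python
import os
import sys

# Assumed Hadoop 1.x layout of a formatted dfs.name.dir; adjust for your version.
EXPECTED_FILES = ("current/VERSION", "current/fsimage", "current/edits")


def missing_namenode_files(name_dir):
    """Return the expected metadata files that are absent under name_dir."""
    return [rel for rel in EXPECTED_FILES
            if not os.path.isfile(os.path.join(name_dir, rel))]


if __name__ == "__main__":
    # Pass one or more candidate directories on the command line.
    for name_dir in sys.argv[1:]:
        missing = missing_namenode_files(name_dir)
        if missing:
            print(f"{name_dir}: missing {', '.join(missing)}")
        else:
            print(f"{name_dir}: looks like a formatted name directory")
```

  Run it against every directory listed in dfs.name.dir, and against any old NFS export you manage to find. A directory that passes is worth copying somewhere safe before you touch anything else.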

  Basically, if you cannot locate that data locally or on your NFS Server,  your cluster is busted.

  But let's be optimistic about this.

  There is a chance that your NFS server is down or that the mount has been lost.

  If it is NFS mounted (as you suggested), check that your host still has that path mounted, and from the proper NFS server.
  ( [shell] mount ) will tell you.
  * Obviously, if you originally mounted foo:/mydata and now mount bar:/mydata, you'll need to do some digging to find which NFS server it was writing to before.
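
  If you would rather script the mount check, here is a minimal sketch. It assumes a Linux host where /proc/mounts lists one mount per line as "device mount_point fstype ..."; the function name and the example path /mnt/name are hypothetical.

```python
import sys


def nfs_source_for(path, mounts_text):
    """Return (server, export, mount_point) if path lives on an NFS mount, else None."""
    best = None  # (mount_point, device, fstype) of the longest matching mount
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        device, mount_point, fstype = fields[:3]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later mount on the same point override an earlier one.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, device, fstype)
    if best is None or best[2] not in ("nfs", "nfs4") or ":" not in best[1]:
        return None
    server, export = best[1].split(":", 1)
    return server, export, best[0]


if __name__ == "__main__":
    # Hypothetical default; pass your real dfs.name.dir as the first argument.
    name_dir = sys.argv[1] if len(sys.argv) > 1 else "/mnt/name"
    with open("/proc/mounts") as f:
        print(nfs_source_for(name_dir, f.read()))
```

  A result of None means the path is on local disk right now, so the namenode would be looking at an empty directory underneath the missing mount. If a server name comes back, compare it with the one you originally mounted from.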

  If you fail to locate your namenode metadata (locally or on any of your NFS servers), either because the NFS server decided to become a black hole or because some<one|thing> removed it, and you don't have a backup of your namenode (tape or Secondary Namenode), then I think you are in a world of hurt.

  In theory you can read the blocks on the DN and try to recover some of your data (assuming it is not stored compressed or with a codec).
Hmm... does anyone know about recovery services? (^^)
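
  To get an idea of what is left on a DataNode, something like the sketch below can list the raw block files. It assumes the usual naming, where each block is a file called blk_<id> with a companion blk_<id>_<genstamp>.meta checksum file; the function name is hypothetical. Keep in mind that without the namenode metadata nothing tells you which file a block belonged to or in what order the blocks go.

```python
import os
import re
import sys

# Block data files are assumed to be named blk_<id>; ids may be negative.
# The blk_<id>_<genstamp>.meta checksum files are deliberately not matched.
BLOCK_RE = re.compile(r"^blk_-?\d+$")


def find_block_files(data_dir):
    """Return a sorted list of (path, size_in_bytes) for block files under data_dir."""
    blocks = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if BLOCK_RE.match(name):
                full = os.path.join(root, name)
                blocks.append((full, os.path.getsize(full)))
    return sorted(blocks)


if __name__ == "__main__":
    # Pass a directory from dfs.data.dir as the first argument.
    for path, size in find_block_files(sys.argv[1]):
        print(f"{size:>12}  {path}")
```

  For plain text data you can then open individual blocks and see whether anything recognisable comes out.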

-----Original Message-----
From: Håvard Wahl Kongsgård [mailto:[EMAIL PROTECTED]]
Sent: Friday, August 24, 2012 5:38 AM
To: [EMAIL PROTECTED]
Subject: Re: namenode not starting

You should start with a reboot of the system.

A lesson to everyone: this is exactly why you should have a secondary name node (http://wiki.apache.org/hadoop/FAQ#What_is_the_purpose_of_the_secondary_name-node.3F)
and run the namenode on a mirrored RAID-5/10 disk.
-Håvard

On Fri, Aug 24, 2012 at 9:40 AM, Abhay Ratnaparkhi <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I had been using the cluster for a long time and had not formatted the namenode.
> I ran bin/stop-all.sh and bin/start-all.sh scripts only.
>
> I am using NFS for dfs.name.dir.
> hadoop.tmp.dir is a /tmp directory. I've not restarted the OS.  Any
> way to recover the data?
>
> Thanks,
> Abhay
>
>
> On Fri, Aug 24, 2012 at 1:01 PM, Bejoy KS <[EMAIL PROTECTED]> wrote:
>>
>> Hi Abhay
>>
>> What is the value for hadoop.tmp.dir or dfs.name.dir? If it was set
>> to /tmp, the contents would be deleted on an OS restart. You need to
>> change this location before you start your NN.
>> Regards
>> Bejoy KS
>>
>> Sent from handheld, please excuse typos.
>> ________________________________
>> From: Abhay Ratnaparkhi <[EMAIL PROTECTED]>
>> Date: Fri, 24 Aug 2012 12:58:41 +0530
>> To: <[EMAIL PROTECTED]>
>> ReplyTo: [EMAIL PROTECTED]
>> Subject: namenode not starting
>>
>> Hello,
>>
>> I had a running hadoop cluster.
>> I restarted it and after that the namenode is unable to start. I am
>> getting an error saying that it's not formatted. :( Is it possible to
>> recover the data on HDFS?
>>
>> 2012-08-24 03:17:55,378 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
>> java.io.IOException: NameNode is not formatted.
>>         at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:434)
>>         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:110)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:291)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:270)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:271)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:303)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:433)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:421)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1359)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)
>> 2012-08-24 03:17:55,380 ERROR

Håvard Wahl Kongsgård
Faculty of Medicine &
Department of Mathematical Sciences
NTNU

http://havard.security-review.net/