On the first run you want the NameNode to initialize its directories (where
it stores the VERSION file, fsimage and edits).
On subsequent formats you are making sure you have a new, EMPTY file
system. If you don't format, the NameNode will load the existing fsimage
and edits.
There is also the matter of generating a new namespace ID, which is matched
against the DataNodes' IDs. So if you format the NameNode, you need to clean
up the data on the DataNodes as well.
On the other hand, if you just add DataNodes to a running cluster, you
don't have to format the NN.
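As a rough sketch of the namespace ID matching, here is how you could compare
the IDs by hand. This assumes each storage directory keeps a current/VERSION
file containing a namespaceID=... line; the function names are made up for
illustration and are not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with Hadoop.

# Print the namespaceID recorded in a VERSION file.
# Assumption: VERSION is a properties-style file with a line
# such as "namespaceID=123456789".
namespace_id() {
    sed -n 's/^namespaceID=//p' "$1"
}

# Succeed only if both VERSION files carry the same, non-empty namespaceID.
# $1: NameNode VERSION file, e.g. <dfs.name.dir>/current/VERSION
# $2: DataNode VERSION file, e.g. <dfs.data.dir>/current/VERSION
same_namespace() {
    nn_id=$(namespace_id "$1")
    dn_id=$(namespace_id "$2")
    [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}
```

Call same_namespace with the two VERSION paths from your own configuration;
a non-zero exit status means the IDs differ, which is the situation you end
up in after formatting the NameNode without cleaning up the DataNodes.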
On 3/9/11 8:27 PM, "Adarsh Sharma" <[EMAIL PROTECTED]> wrote:
> Dear all,
> I have configured a Hadoop cluster several times, with 2, 3, 5, and 8
> nodes, but one doubt always comes to mind.
> Why is it necessary to format the Hadoop NameNode with the
> *bin/hadoop namenode -format* command?
> What is the reason and logic behind this?
> Please explain if someone knows.
> Thanks & best Regards,
> Adarsh Sharma