Re: why not hadoop backup name node data to local disk daily or hourly?
Mohammad Tariq 2012-12-20, 10:24
I am sorry Andy, I forgot one important point.
The Secondary NameNode has been deprecated now, so consider using the
Checkpoint Node or Backup Node. The Checkpoint Node is the process that is
actually responsible for creating periodic checkpoints. It downloads the
fsimage and edit logs from the active NameNode, merges them locally, and
uploads the new image back to the active NameNode.
It is advisable to run the Checkpoint Node on a different machine, as it
needs roughly as much memory as the NameNode.
You can start the Checkpoint Node with "bin/hdfs namenode -checkpoint".
The default maximum delay between two consecutive checkpoints
is 1 hour (which is exactly what you want, right?), but you can configure
it as per your requirements through "dfs.namenode.checkpoint.period"
(the value is given in seconds).
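As for the timestamped copies Andy asks about, here is a minimal sketch in
Python of what such a script could look like. It is not something Hadoop
ships. The function name snapshot_metadata, its parameters, and any paths
passed to it are hypothetical. It simply copies a metadata directory into a
folder named after the current time and prunes the oldest copies.

```python
import shutil
import time
from pathlib import Path


def snapshot_metadata(src_dir, backup_root, keep=24, stamp=None):
    """Copy src_dir to backup_root/<timestamp> and keep the newest `keep` copies.

    All names here are illustrative assumptions, not part of Hadoop.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    src = Path(src_dir)
    root = Path(backup_root)
    root.mkdir(parents=True, exist_ok=True)

    # Timestamped directory name (e.g. 20121220-102400) so that a plain
    # lexical sort is also a chronological sort.
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = root / stamp
    shutil.copytree(src, dest)  # raises if dest already exists

    # Prune everything except the newest `keep` snapshots.
    snapshots = sorted(p for p in root.iterdir() if p.is_dir())
    for old in snapshots[:-keep]:
        shutil.rmtree(old)
    return dest
```

Run hourly from cron, a call such as
snapshot_metadata("/data/checkpoint/current", "/backup/nn", keep=48)
(both paths are made up) would keep two days of history. It is probably
safer to point it at the Checkpoint Node's directory than at the live
NameNode directory, so the copy is not taken in the middle of a write.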
On Thu, Dec 20, 2012 at 3:38 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
> Hello Andy,
> The NN stores all the metadata in a file called "fsimage". The
> fsimage file contains a snapshot of the HDFS metadata. Along with the
> fsimage, the NN also keeps "edit log" files. Whenever there is a change to
> HDFS, it gets appended to the edits file. When these log files grow big,
> they are merged with the fsimage file. These files are stored on the local
> FS at the path specified by the "dfs.name.dir" property in the
> "hdfs-site.xml" file. To prevent any loss you can give multiple locations
> as the value of this property, say one on your local disk and another on a
> network drive, so that if your HD crashes you still have the metadata safe
> on the network drive (the situation you faced recently).
> Now, coming to the SNN. It is a helper node for the NN. The SNN
> periodically pulls the fsimage and the edits file, which would have grown
> quite big by now, merges them, and sends the new fsimage back to the NN.
> The NN then starts the cycle again. Suppose you are running completely out
> of luck and lose the entire NN. In such a case you can take the copy of
> the fsimage from the SNN and get your metadata back.
> Best Regards,
> On Thu, Dec 20, 2012 at 3:18 PM, 周梦想 <[EMAIL PROTECTED]> wrote:
>> For some reason my name node data became corrupted, and the corrupted
>> data also overwrote the secondary name node data and the NFS backup. I
>> want to recover the name node data from a day ago or even a week ago, but
>> I can't. Do I have to back up the name node data manually, or write a bash
>> script to back it up? Why doesn't Hadoop provide a configuration option to
>> back up name node data to local disk daily or hourly under a different
>> timestamped name?
>> The same question applies to HBase's .META. and -ROOT- tables. I think
>> keeping their history is 100 times more important than the log history.
>> I think it could be implemented in the Secondary NameNode, Checkpoint Node,
>> or Backup Node. For now I do this with a bash script.
>> Does anyone agree with me?
>> Best Regards,
>> Andy Zhou