Re: why not hadoop backup name node data to local disk daily or hourly?
Harsh J 2012-12-20, 10:06
On Thu, Dec 20, 2012 at 3:18 PM, 周梦想 <[EMAIL PROTECTED]> wrote:
> For some reason my name node data became corrupt, and the bad data also
> overwrote the secondary name node data and the NFS backup. I want to recover
> the name node data from a day ago or even a week ago, but I can't.
The SecondaryNameNode does this, which is also why running one is
recommended. In HA HDFS, the StandbyNameNode performs the same
checkpointing as the SecondaryNameNode.
This form of corruption reaching the SNN as well should *never* occur
normally, and you should actively monitor the SNN's last-checkpoint time
so that it does not grow too old (a sign of issues). Your version of
Hadoop is probably still affected by
https://issues.apache.org/jira/browse/HDFS-3652, and you should upgrade
to avoid data loss from it.
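The last-checkpoint-time monitoring mentioned above could be sketched roughly as follows. This is only an illustration, not a Hadoop tool: the function names, the `fsimage*` file pattern, and the idea of using file modification time as a proxy for the last checkpoint are all assumptions, and the directory you pass in would be wherever your SNN writes its checkpoint image.

```python
import glob
import os
import time


def newest_checkpoint_age(checkpoint_dir, pattern="fsimage*", now=None):
    """Age in seconds of the newest file matching `pattern`, or None if none exist.

    `checkpoint_dir` and `pattern` are hypothetical parameters; point them at
    the directory and file name your SNN actually uses.
    """
    now = time.time() if now is None else now
    paths = glob.glob(os.path.join(checkpoint_dir, pattern))
    if not paths:
        return None
    newest_mtime = max(os.path.getmtime(p) for p in paths)
    return now - newest_mtime


def checkpoint_is_stale(checkpoint_dir, max_age_seconds, pattern="fsimage*", now=None):
    """True if no checkpoint image is found or the newest one is too old."""
    age = newest_checkpoint_age(checkpoint_dir, pattern, now)
    return age is None or age > max_age_seconds
```

A cron job or monitoring check could call `checkpoint_is_stale(...)` with a threshold of a few checkpoint periods and raise an alert when it returns True.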
Also, if you ever suspect a local copy of the NN to be bad, save its
namespace (hadoop dfsadmin -saveNamespace, which requires the NN to be
put in safemode first) before you bring it down; this saves a copy from
memory onto the disk.
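As a rough sketch of that sequence, the commands could be driven from a small script like the one below. The `hadoop dfsadmin -safemode enter` and `-saveNamespace` commands are the real CLI; the wrapper function, its `leave_safemode` flag, and the injectable `runner` are assumptions made for illustration. It would need to run as the HDFS superuser.

```python
import subprocess


def save_namespace(leave_safemode=False, runner=subprocess.check_call):
    """Enter safemode, save the namespace, and optionally leave safemode.

    `runner` is a hypothetical hook so the steps can be dry-run; by default
    each command is executed and a non-zero exit raises CalledProcessError.
    """
    steps = [
        ["hadoop", "dfsadmin", "-safemode", "enter"],
        ["hadoop", "dfsadmin", "-saveNamespace"],
    ]
    if leave_safemode:
        # Only needed if the NN is to keep serving writes afterwards.
        steps.append(["hadoop", "dfsadmin", "-safemode", "leave"])
    for argv in steps:
        runner(argv)
    return steps
```

Passing `runner=print` shows the commands without executing them, which is a cheap way to check the order before touching a live NN.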
> Do I have to back
> up name node data manually or write a bash script to back it up? Why doesn't
> Hadoop provide a configuration option to back up name node data to local disk
> daily or hourly under differently timestamped names?
If the NN's disk itself is corrupt, backing it up would be no good
either, so this solution, compared with the SNN, still doesn't solve
anything.
> The same question applies to HBase's .META. and -ROOT- tables. I think their
> history is 100 times more important than the log history.
The HBase .META. and -ROOT- tables are already on HDFS, so they are
pretty reliable (with HBase's WAL and 3x replication of blocks).
> I think it could be implemented in the SecondaryNameNode/CheckpointNode or
> BackupNode. For now I do this with just a bash script.
I don't think using a bash script to back up the metadata is a better
solution than relying on the SecondaryNameNode. Two reasons: it is
just the same kind of copy-backup (with no validation like the SNN
does), and it does not checkpoint (i.e. merge the edits into the fsimage).