Hadoop >> mail # user >> why not hadoop backup name node data to local disk daily or hourly?


周梦想 2012-12-20, 09:48
Re: why not hadoop backup name node data to local disk daily or hourly?
Hello Andy,

The NN stores all of its metadata in a file called "fsimage", which holds a
snapshot of the HDFS metadata. Alongside the fsimage, the NN keeps "edit
log" files: every change to HDFS is appended to the edits file. When the
edit log grows large, it is merged into the fsimage. These files are stored
on the local FS at the path specified by the "dfs.name.dir" property in
"hdfs-site.xml". To guard against loss, you can give multiple locations as
the value of this property, say one on your local disk and another on a
network drive. If your hard disk crashes (the kind of situation you ran
into recently), the metadata is still safe on the network drive.
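
As a rough illustration of the multiple-location setup described above, the
property takes a comma-separated list of directories. Both paths below are
made-up placeholders, not values from this thread. On Hadoop 2.x and later
the property is named "dfs.namenode.name.dir". Note that the NN writes the
same data to every listed directory, so this protects against losing a disk
but not against bad metadata being written to all copies.

```xml
<!-- hdfs-site.xml fragment; both paths are hypothetical examples -->
<property>
  <name>dfs.name.dir</name>
  <!-- comma-separated list: the NN writes its metadata to every directory -->
  <value>/data/1/dfs/nn,/mnt/nfs/dfs/nn</value>
</property>
```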

Now, coming to the SNN. It is a helper node for the NN. The SNN periodically
pulls the fsimage and the edits file, which will have grown quite big by
then, merges them into a new fsimage, and sends it back to the NN. The NN
then starts the cycle again with a fresh edits file. Suppose you are
completely out of luck and lose the entire NN. In that case you can take
the copy of the fsimage from the SNN and recover your metadata as of the
last checkpoint.
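
Since the SNN only keeps the latest checkpoint, keeping older copies around
means copying the directory somewhere under a timestamped name. Here is a
minimal sketch of that idea, not a feature Hadoop provides. The function
name, the "keep" parameter and all paths are assumptions for illustration.
It assumes the source directory is not being rewritten while the copy runs
(for example, the SNN checkpoint directory between two checkpoints).

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot_metadata(src_dir, backup_root, keep=24, now=None):
    """Copy src_dir to backup_root/<timestamp> and prune old snapshots.

    Hypothetical helper: src_dir would be a copy of the NN metadata, such
    as the SNN checkpoint directory. Returns the path of the new snapshot.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    now = now or datetime.now()
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)

    # copytree refuses to overwrite, so an existing snapshot is never clobbered.
    dest = backup_root / now.strftime("%Y%m%d-%H%M%S")
    shutil.copytree(src_dir, dest)

    # Timestamped names sort chronologically, so the oldest come first.
    snapshots = sorted(p for p in backup_root.iterdir() if p.is_dir())
    for old in snapshots[:-keep]:
        shutil.rmtree(old)
    return dest
```

Run hourly from cron, a call such as
snapshot_metadata("/data/1/dfs/snn", "/backup/nn-meta", keep=48) would keep
roughly two days of history; both paths are placeholders.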

HTH

Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
On Thu, Dec 20, 2012 at 3:18 PM, 周梦想 <[EMAIL PROTECTED]> wrote:

> For some reason my name node data became corrupted, and the corrupted
> data also overwrote the secondary name node's copy and the NFS backup. I
> want to recover the name node data from a day ago, or even a week ago,
> but I can't. Do I have to back up the name node data manually or write a
> bash script to do it? Why doesn't Hadoop offer a configuration option to
> back up name node data to local disk daily or hourly, under timestamped
> names?
>
> The same question applies to HBase's .META. and -ROOT- tables. I think
> their history is 100 times more important than the log history.
>
> I think this could be implemented in the Secondary NameNode, Checkpoint
> Node, or Backup Node. For now I do it with a bash script.
>
> Does anyone agree with me?
>
>
> Best Regards,
> Andy Zhou
>