-
why not hadoop backup name node data to local disk daily or hourly?
周梦想 2012-12-20, 09:48

For some reason my name node data became corrupted, and the corrupted data also overwrote the secondary name node data and the NFS backup. I want to recover the name node data from a day ago, or even a week ago, but I can't.

Do I have to back up the name node data manually, or write a bash script to back it up? Why doesn't Hadoop provide a configuration option to back up the name node data to local disk daily or hourly, under a different timestamped name each time?

The same question applies to HBase's .META. and -ROOT- tables. I think their history is 100 times more important than the log history.

I think this could be implemented in the Secondary NameNode, Checkpoint Node, or Backup Node. For now I do it with a bash script.

Does anyone agree with me?

Best Regards,
Andy Zhou
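
A minimal sketch of the kind of timestamped copy described above, written in Python rather than bash. The function name `backup_name_dir`, the `name-<timestamp>` naming scheme, and the `keep` retention count are all assumptions made for illustration; nothing here is a Hadoop feature. A plain copy like this does no validation and no checkpointing, and copying a directory that the NameNode is actively writing to may capture an inconsistent state.

```python
import shutil
import time
from pathlib import Path


def backup_name_dir(name_dir, backup_root, keep=7, now=None):
    """Copy name_dir to backup_root/name-<timestamp> and prune old copies.

    All names and the retention policy are hypothetical, for illustration only.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    name_dir = Path(name_dir)
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)

    # now=None means "current time"; a fixed value makes the run reproducible.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    target = backup_root / f"name-{stamp}"
    shutil.copytree(name_dir, target)

    # Timestamped names sort chronologically, so the oldest copies come first.
    snapshots = sorted(p for p in backup_root.glob("name-*") if p.is_dir())
    for old in snapshots[:-keep]:
        shutil.rmtree(old)
    return target
```

Called from cron once an hour with `keep=24`, this would retain roughly one day of hourly copies; the source and destination paths depend on the local `dfs.name.dir` setting.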
-
Re: why not hadoop backup name node data to local disk daily or hourly?
Harsh J 2012-12-20, 10:06

Hi,

On Thu, Dec 20, 2012 at 3:18 PM, 周梦想 <[EMAIL PROTECTED]> wrote:
> Some reasons lead to my name node data error, but the error data also
> overwrite the second name node data, also the NFS backup. I want to recover
> the name node data a day ago or even a week ago,but I can't.

The SecondaryNameNode does this, and that is also why running it is recommended. In HA HDFS, the StandbyNameNode performs the same checkpointing as the SecondaryNameNode, to achieve the same periodic goal. This form of corruption at the SNN too should *never* occur normally, and your SNN's last-checkpoint time should be actively monitored so that it does not grow too old (a sign of issues). Your version of Hadoop is probably still affected by https://issues.apache.org/jira/browse/HDFS-3652, and you should update to avoid loss due to it.

Also, if you ever suspect a local copy of the NN metadata to be bad, save its namespace (hadoop dfsadmin -saveNamespace, which requires the NN to be put in safemode first) before you bring it down; this saves a copy from memory onto the disk.

> I have to back
> up name node data manually or write a bash script to backup it? why hadoop
> does not give a configure to backup name node data to local disk daily or
> hourly with different time stamp name?

If the NN's disk itself is corrupt, backing it up would be no good either, so this solution vs. SNN still doesn't solve anything of your original issue.

> The same question is to HBase's .META. and -ROOT- table. I think it's
> history storage is more important 100 times than the log history.

The HBase .META. and -ROOT- are already on HDFS, so they are pretty reliable (with HBase's WAL and 3x replication of blocks).

> I think it could be implemented in Second Name Node/Check Points Node or
> Back Node. Now I do this just using bash script.

I don't think using a bash script to back up the metadata is a better solution than relying on the SecondaryNameNode. Two reasons: it does the same form of copy-backup (with no validation like the SNN does), and it does not checkpoint (i.e. merge the edits into the fsimage).

--
Harsh J
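
The saveNamespace step mentioned above can be sketched as a short Python wrapper around the two commands named in the reply. The wrapper itself (`save_namespace`, the `leave_safemode` flag, and the injectable `run` callable) is a hypothetical convenience, not part of Hadoop; only the `hadoop dfsadmin -safemode enter|leave` and `hadoop dfsadmin -saveNamespace` commands come from Hadoop itself, and the sketch assumes `hadoop` is on the PATH of a user allowed to run dfsadmin.

```python
import subprocess

# Safemode must be entered before the namespace can be saved.
ENTER_SAFEMODE = ["hadoop", "dfsadmin", "-safemode", "enter"]
SAVE_NAMESPACE = ["hadoop", "dfsadmin", "-saveNamespace"]
LEAVE_SAFEMODE = ["hadoop", "dfsadmin", "-safemode", "leave"]


def save_namespace(leave_safemode=False, run=subprocess.check_call):
    """Run the safemode/saveNamespace sequence in order.

    With the default runner, a non-zero exit status raises
    CalledProcessError and the remaining steps are skipped.
    """
    steps = [ENTER_SAFEMODE, SAVE_NAMESPACE]
    if leave_safemode:
        # Only useful if the NameNode should keep serving afterwards.
        steps.append(LEAVE_SAFEMODE)
    for cmd in steps:
        run(cmd)
    return steps
```

Leaving safemode is off by default here because the reply describes saving the namespace just before bringing the NameNode down; passing `run=print` gives a dry run that only prints the commands.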
-
Re: why not hadoop backup name node data to local disk daily or hourly?
Mohammad Tariq 2012-12-20, 10:08

Hello Andy,

The NN stores all the metadata in a file called "fsimage". The fsimage file contains a snapshot of the HDFS metadata. Along with the fsimage, the NN also holds "edit log" files. Whenever there is a change to HDFS, it gets appended to the edits file. When these log files grow big, they are merged with the fsimage file. These files are stored on the local FS at the path specified by the "dfs.name.dir" property in the "hdfs-site.xml" file. To prevent any loss you can give multiple locations as the value of this property, say one on your local disk and another on a network drive, so that if your HD crashes you still have the metadata safe on the network drive. (This is the situation you faced recently.)

Now, coming to the SNN. It is a helper node for the NN. The SNN periodically pulls the fsimage and the edits file (which would have grown quite big by then), merges them, and sends the merged fsimage back to the NN, after which the NN starts the cycle again. Suppose you are running completely out of luck and lose the entire NN. In that case you can take this copy of the fsimage from the SNN and retrieve your metadata.

HTH

Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
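
As a small illustration of the multi-location setting described above, the following Python sketch parses an `hdfs-site.xml` fragment and lists the directories configured under `dfs.name.dir`. The sample paths (`/data/1/dfs/name`, `/mnt/nfs/dfs/name`) and the helper name `name_dirs` are hypothetical; the sketch assumes the value is a comma-separated list. Newer Hadoop releases use the property name `dfs.namenode.name.dir`, which can be passed as `key`.

```python
import xml.etree.ElementTree as ET

# Hypothetical config: one local directory and one on a network mount.
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/1/dfs/name,/mnt/nfs/dfs/name</value>
  </property>
</configuration>
"""


def name_dirs(hdfs_site_xml, key="dfs.name.dir"):
    """Return the directories listed under `key`, or [] if it is not set."""
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == key:
            value = prop.findtext("value", "")
            # Split the comma-separated list and drop empty entries.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


print(name_dirs(SAMPLE_HDFS_SITE))
```

A check like this can be run before starting the NameNode to confirm that more than one location is configured and, with an extra existence test per path, that the network mount is actually present.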
-
Re: why not hadoop backup name node data to local disk daily or hourly?
Mohammad Tariq 2012-12-20, 10:24

I am sorry Andy, I forgot one important point.

The Secondary NameNode has been deprecated now, so consider using the Checkpoint Node or Backup Node. The Checkpoint Node is the process that is actually responsible for creating periodic checkpoints. It downloads the fsimage and edit logs from the active NameNode, merges them locally, and uploads the new image back to the active NameNode. It is advisable to run the Checkpoint Node on a different machine, as it consumes almost as much memory as the NameNode. You can start the Checkpoint Node with the "bin/hdfs namenode -checkpoint" command. The default value of the maximum delay between two consecutive checkpoints is 1 hour (which is exactly what you want, right?), but you can configure it as per your requirements through "dfs.namenode.checkpoint.period".

HTH

Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
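
Since checkpoints are expected about once per period, their age can be watched with a small script. The sketch below is one hypothetical way to do that in Python: it looks for files whose names start with `fsimage` anywhere under a checkpoint directory and reports whether the newest one is older than a few checkpoint periods. The function names, the `slack` multiplier, and the file-name pattern are assumptions; the 3600-second default mirrors the 1-hour checkpoint period mentioned above.

```python
import time
from pathlib import Path


def newest_fsimage_age(checkpoint_dir, now=None):
    """Age in seconds of the most recently modified fsimage* file, or None."""
    now = time.time() if now is None else now
    mtimes = [
        p.stat().st_mtime
        for p in Path(checkpoint_dir).rglob("fsimage*")
        if p.is_file()
    ]
    if not mtimes:
        return None
    return now - max(mtimes)


def checkpoint_is_stale(checkpoint_dir, period_seconds=3600, slack=3, now=None):
    """True if no fsimage exists or the newest is older than slack periods."""
    age = newest_fsimage_age(checkpoint_dir, now)
    return age is None or age > period_seconds * slack
```

Run from cron against the checkpoint directory, `checkpoint_is_stale` returning True would be a reason to alert; the directory to pass in depends on the local checkpoint configuration.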
-
Re: why not hadoop backup name node data to local disk daily or hourly?
Harsh J 2012-12-20, 10:34

Hi Mohammad,

On Thu, Dec 20, 2012 at 3:54 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
> I am sorry Andy, I forgot one important point.
>
> The Secondary NameNode has been deprecated now, so consider using the
> Checkpoint Node or Backup Node. Checkpoint Node is the process which is
> actually responsible for creating periodic check points. It downloads
> fsimage and log edits from the active NameNode, merges them locally, and
> uploads the new image back to the active NameNode.

This isn't true anymore. We are continuing to keep the SNN and have undeprecated it. See https://issues.apache.org/jira/browse/HDFS-2397. We are perhaps deprecating the CheckpointNode though: https://issues.apache.org/jira/browse/HDFS-4114.

--
Harsh J
-
Re: why not hadoop backup name node data to local disk daily or hourly?
Mohammad Tariq 2012-12-20, 10:43

Ohhhh... This is the benefit of sharing space with you. Thank you so much for keeping my knowledge base updated. It's high time I did a proper re-scan of everything.

@Andy: Now I'm truly sorry for passing on the wrong info.

Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
-
Re: why not hadoop backup name node data to local disk daily or hourly?
Nitin Pawar 2012-12-24, 10:51

What do you mean by "We changed all IPs of the Hadoop System"?

Did you change the IPs of all the nodes in one go, or did you retire the nodes one by one, change their IPs, and bring them back into rotation? Also, did you change the IP of your NN as well?

On Mon, Dec 24, 2012 at 4:10 PM, 周梦想 <[EMAIL PROTECTED]> wrote:
> Actually the problem was beggining at SecondNameNode. We changed all IPs
> of the Hadoop System

--
Nitin Pawar
-
Re: why not hadoop backup name node data to local disk daily or hourly?
周梦想 2012-12-24, 10:54

I stopped Hadoop, changed every node's IP, reconfigured, and started Hadoop again. Yes, we did change the IP of the NN.