-
Recovering fsImage from Namenode Logs
ishan chhabra 2012-12-20, 06:26
Hi all, I accidentally issued an rmr on my home directory, but killed the NameNode as soon as I realized it. I am now in a situation where my DataNodes still hold a good percentage of the blocks, but the NameNode fsImage and edit log no longer mention any files, or file-to-block mappings, in that directory. I also don't have any previous checkpoints of the fsImage.
Fortunately, what I do have is NameNode logs for the past few days that have the NameSystem changes recorded. Is there a way to reconstruct my old fsImage from the logs so that it recognizes the blocks that are still on the DataNodes? Has anybody tried something like this before?
-- Thanks.
Regards, Ishan Chhabra
-
Re: Recovering fsImage from Namenode Logs
Harsh J 2012-12-20, 06:40
Your NameNode directory, if it still exists, should have a previous.checkpoint/ directory under it, from which you can extract the previous checkpoint's files and replace the originals with them. Ensure it's not too old and that things are fine before you finally take the NN out of safemode.
What you describe is possible, if you can reconstruct the files and their allocated block IDs (along with generation stamps) and from those form proper INode entries to append to or recreate your fsimage. They are merely serialized operation entries, understandable if you go over the format in depth. Provided you have all of the exact data required, and the skills to browse, understand and modify the code, reconstruct the entries and store them in the proper order, this is certainly doable. It is usually simpler to just roll back to the previous checkpoint to save some of the blocks.
-- Harsh J
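As a rough illustration of the first step described above (recovering files and their allocated block IDs and generation stamps from the logs), here is a minimal Python sketch. It assumes each block allocation appears in the NameNode log as a line containing "NameSystem.allocateBlock: <path>. blk_<id>_<genstamp>"; that line format is an assumption and should be checked against your own logs. The names ALLOC_LINE and blocks_by_file are hypothetical. The sketch only rebuilds the path-to-block mapping; it does not produce INode entries or an fsimage.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# ASSUMED log format: "... NameSystem.allocateBlock: <path>. blk_<id>_<genstamp>"
# Adjust the pattern to whatever your NameNode version actually logs.
ALLOC_LINE = re.compile(
    r"NameSystem\.allocateBlock: (?P<path>.+)\. blk_(?P<id>-?\d+)_(?P<gs>\d+)\s*$"
)


def blocks_by_file(lines: Iterable[str]) -> Dict[str, List[Tuple[int, int]]]:
    """Map each file path to its (block_id, generation_stamp) pairs, in log order."""
    files: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for line in lines:
        match = ALLOC_LINE.search(line)
        if match is None:
            continue  # not an allocation line under the assumed format
        files[match.group("path")].append(
            (int(match.group("id")), int(match.group("gs")))
        )
    return dict(files)
```

Feeding the function an open log file (for example, blocks_by_file(open(log_path)) with your own log_path) yields one ordered block list per file, which can then be compared against what the DataNodes actually hold.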
-
Re: Recovering fsImage from Namenode Logs
ishan chhabra 2012-12-20, 08:33
Unfortunately, the checkpoint image that I have already has the deletes recorded, so I cannot use it. I do have an image that is 15 days old, which I am currently running.
I looked at my logs and I have the filename, the allocated block and the generation stamp. Can you explain the importance of the generation stamp here? Since my HDFS cluster is operational with the old image and I am writing new data to it, the generation stamp must have been incremented beyond what it was 15 days ago. If I try to restore a block that was written, let's say, 13 days ago, there could be a generation stamp collision. So, if I stop my cluster and create the new entries with generation stamps that come after the current one in the NameNode, will that be OK? Is the generation stamp stored somewhere on the DataNode, or in the block stored on the DataNode?
Thanks for the clarifications.
-- Thanks.
Regards, Ishan Chhabra
-
Re: Recovering fsImage from Namenode Logs
Colin McCabe 2012-12-27, 20:41
On Thu, Dec 20, 2012 at 12:33 AM, ishan chhabra <[EMAIL PROTECTED]> wrote:
> Unfortunately, the checkpoint image that I have has the deletes recorded. I
> cannot use it. I do have an image that is 15 days old, which I am currently
> running.
>
> I looked at the my logs and I have the filename, block allocated and
> generation stamp. Can you explain to me the importance of the generation
> stamp here? Since my hdfs cluster is operational with the old image and I
> am writing new data to it, the generation stamp must have been incremented
> beyond what it was 15 days ago. If I try to restore a block that we
> written, lets say, 13 days ago, there can be generation stamp collision.
> So, if I stop my cluster and make the new entries with generation stamp
> increments after what is currently in the namenode, will it be ok? Is the
> generation stamp stored somewhere in the datanode or the block stored in
> the datanode?
The generation stamp is stored by the datanode in the block directory, as part of the .meta filename.
For example, if block -8546336708468389550 has genstamp 1002, you would see something like this:
cmccabe@keter:/h> ls -l /r/data1/current/BP-380817083-127.0.0.1-1356638793552/current/finalized/*8546336708468389550*
total 8
-rw-r--r-- 1 cmccabe users 2025 Dec 27 12:38 blk_-8546336708468389550
-rw-r--r-- 1 cmccabe users   23 Dec 27 12:38 blk_-8546336708468389550_1002.meta
cheers, Colin
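The naming convention in the listing above (blk_<block id>_<generation stamp>.meta) can be read back programmatically. The following is a minimal Python sketch, not an HDFS tool: the names META_NAME, parse_meta_name and scan_block_dir are hypothetical, and the only thing it relies on is the file-name pattern shown in the example.

```python
import os
import re
from typing import Dict, Optional, Tuple

# Matches names such as blk_-8546336708468389550_1002.meta
META_NAME = re.compile(r"^blk_(-?\d+)_(\d+)\.meta$")


def parse_meta_name(name: str) -> Optional[Tuple[int, int]]:
    """Return (block_id, generation_stamp) for a .meta file name, else None."""
    match = META_NAME.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def scan_block_dir(root: str) -> Dict[int, int]:
    """Walk a DataNode block directory and map block_id -> generation stamp."""
    stamps: Dict[int, int] = {}
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            parsed = parse_meta_name(name)
            if parsed is not None:
                block_id, genstamp = parsed
                stamps[block_id] = genstamp
    return stamps
```

Running scan_block_dir over a DataNode's finalized directory gives the generation stamp each stored block actually carries on disk, which is the value any reconstructed entry for that block would need to agree with.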