On Thu, Feb 23, 2012 at 12:41 AM, Jeremy Hansen <[EMAIL PROTECTED]> wrote:
> Thanks. Could you clarify what BackupNode does?
The Namenode currently keeps the entire file system namespace in memory. It
logs write operations (creating or deleting a file, etc.) to a journal file
called the editlog. This journal must be merged with the file system image
periodically to keep it from growing too large. This merge is called
checkpointing. Checkpointing also reduces startup time, since the Namenode
does not have to load a large editlog file.
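To make the journal-plus-image idea concrete, here is a rough toy sketch in
Python. It is not Hadoop code: ToyNamenode, replay, and the tuple format of
the edits are names made up for illustration, and the real image and editlog
are on-disk files rather than Python objects.

```python
# Toy model of checkpointing; all names here are hypothetical, not Hadoop APIs.

def replay(image, edits):
    """Rebuild a namespace by applying journaled edits on top of an image."""
    namespace = set(image)
    for op, path in edits:
        if op == "create":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
    return namespace


class ToyNamenode:
    def __init__(self):
        self.image = set()      # last checkpointed namespace (stands in for the image file)
        self.editlog = []       # operations journaled since the last checkpoint
        self.namespace = set()  # live in-memory namespace

    def create(self, path):
        self.namespace.add(path)
        self.editlog.append(("create", path))

    def delete(self, path):
        self.namespace.discard(path)
        self.editlog.append(("delete", path))

    def checkpoint(self):
        # Merge the journal into the image, then start a fresh, empty journal.
        self.image = replay(self.image, self.editlog)
        self.editlog = []
```

On restart, the namespace would be rebuilt as replay(image, editlog), so the
shorter the editlog is at that point, the less work startup has to do.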
Prior to release 0.21, another node called the SecondaryNamenode was used for
checkpointing. It periodically fetches the file system image and edits, loads
them into memory, and writes a checkpoint image. This image is then shipped to
the Namenode.
In 0.21, the BackupNode was introduced. Unlike the SecondaryNamenode, it
receives edits streamed from the Namenode. It periodically writes a checkpoint
image and ships it back to the Namenode. The goal was for it to become a
Standby node, as a step towards Namenode HA. Konstantin and a few others are
pursuing this.
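The difference between the two approaches can be sketched roughly as follows.
Again this is a toy in Python, not Hadoop code: ToySecondary, ToyBackup,
on_edit, and apply_edit are hypothetical names, and the real nodes exchange
files and RPC streams rather than Python sets and tuples.

```python
# Toy contrast of pull-based and stream-based checkpointing.
# All names are hypothetical; this does not mirror Hadoop's classes.

def apply_edit(namespace, edit):
    """Apply one journaled operation to an in-memory namespace."""
    op, path = edit
    if op == "create":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)


class ToySecondary:
    """Pull model: fetch the image and the edits, then replay them all."""

    def checkpoint(self, image, edits):
        namespace = set(image)
        for edit in edits:
            apply_edit(namespace, edit)
        return namespace  # the new checkpoint image


class ToyBackup:
    """Stream model: apply each edit as it arrives from the Namenode."""

    def __init__(self, image):
        self.namespace = set(image)

    def on_edit(self, edit):
        apply_edit(self.namespace, edit)

    def checkpoint(self):
        # The namespace is already current, so nothing has to be replayed.
        return set(self.namespace)
```

Both produce the same image for the same edits. The stream model keeps a
current copy of the namespace in memory at all times, which is the property
that makes it a natural starting point for a Standby node.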
I have not seen any deployments of BackupNode in production. I would love
to hear if anyone has deployed it in production and how stable it is.