Re: HA for hadoop-0.20.2
Hi Liu,

HA in Hadoop 2 was nearly a year-long project. There's likely no way you'll
be able to backport it onto 0.20 and end up with a stable product.

If you want to attempt it, by all means go ahead, but I don't expect much
help from the community.

-Todd

On Wed, Nov 14, 2012 at 2:22 AM, lei liu <[EMAIL PROTECTED]> wrote:

> I want to implement an HA function for hadoop-0.20.2.
>
> I use a StandbyNN instance to implement it. One thread reads the edits
> file from a shared directory (for example, NFS) and applies the transaction
> log to the StandbyNN's namespace. We can call this thread the ingest
> thread.
>
> The StandbyNN instance also does a checkpoint every hour. Before a
> checkpoint, it has to wait for the ingest thread to finish reading and
> exit. After the checkpoint finishes, the StandbyNN instance creates a new
> ingest thread to read the new edits file (the edits.new file is renamed to
> edits).
>
> If the checkpoint fails, for example because uploading the image file
> fails, the ingest thread no longer exists, so I need to handle the
> exception. I think the simplest way is to create a new ingest thread to
> read the edits file, but then the same edits file may be read more than
> once, so the same transaction log would be applied to the StandbyNN's
> namespace more than once.
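
One possible way around the re-read problem described above, sketched under assumptions: remember how far into the current edits file the previous ingest pass got, and let a replacement ingest thread resume from there. Everything in this block is hypothetical (the class `IngestSketch`, the record counter `applied`, and strings standing in for edit records); it is not code from hadoop-0.20.2, where the position would more likely be a byte offset into the file.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a StandbyNN: it only records which edit records it applied. */
class IngestSketch {
    // Hypothetical: how many records of the current edits file are already applied.
    private int applied = 0;
    // Toy "namespace": the list of applied records, in order.
    private final List<String> namespace = new ArrayList<>();

    /** One ingest pass: apply only records that no earlier pass has applied. */
    synchronized void ingest(List<String> editsFile) {
        while (applied < editsFile.size()) {
            namespace.add(editsFile.get(applied));
            applied++;
        }
    }

    /** Assumed to be called only after a successful checkpoint has rolled the edits file. */
    synchronized void rollToNewEditsFile() {
        applied = 0;
    }

    synchronized List<String> snapshot() {
        return new ArrayList<>(namespace);
    }

    public static void main(String[] args) {
        IngestSketch standby = new IngestSketch();
        List<String> edits = new ArrayList<>();
        edits.add("mkdir /a");
        edits.add("mkdir /b");

        standby.ingest(edits);   // first ingest thread reads the file
        // Checkpoint fails here, so rollToNewEditsFile() is never called.
        edits.add("mkdir /c");   // the same file keeps growing
        standby.ingest(edits);   // replacement ingest thread re-reads the same file

        System.out.println(standby.snapshot()); // [mkdir /a, mkdir /b, mkdir /c]
    }
}
```

The sketch assumes the edits file is only appended to between passes. If the file can be replaced or truncated while the position is kept, the saved position no longer means anything and this approach does not apply.
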
>
>
> I studied the hadoop-2.0 code, and there is the code below in the
> FSEditLogLoader.loadEditRecords method:
> if (op.hasTransactionId()) {
>   if (op.getTransactionId() > expectedTxId) {
>     MetaRecoveryContext.editLogLoaderPrompt("There appears " +
>         "to be a gap in the edit log. We expected txid " +
>         expectedTxId + ", but got txid " +
>         op.getTransactionId() + ".", recovery, "ignoring missing " +
>         " transaction IDs");
>   } else if (op.getTransactionId() < expectedTxId) {
>     MetaRecoveryContext.editLogLoaderPrompt("There appears " +
>         "to be an out-of-order edit in the edit log. We " +
>         "expected txid " + expectedTxId + ", but got txid " +
>         op.getTransactionId() + ".", recovery,
>         "skipping the out-of-order edit");
>     continue;
>   }
> }
>
> The method uses the transaction id to guarantee that the same transaction
> log is not applied to the namespace more than once.
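
To make the effect of that txid check concrete, here is a toy model, not HDFS code. The class `TxidReplaySketch`, its `Op` record, and the "add one block to a path" operation are all assumptions chosen for illustration. The operation is deliberately not idempotent, so applying it twice changes the result.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TxidReplaySketch {
    /** Hypothetical edit record: a transaction id plus "add one block to path". */
    static final class Op {
        final long txid;
        final String path;
        Op(long txid, String path) { this.txid = txid; this.path = path; }
    }

    private long expectedTxId = 1;                                  // next txid we expect
    private final Map<String, Integer> blocks = new HashMap<>();    // toy namespace

    /** Replays a log; when checkTxid is true, records with an already-seen txid are skipped. */
    void load(List<Op> log, boolean checkTxid) {
        for (Op op : log) {
            if (checkTxid && op.txid < expectedTxId) {
                continue; // already applied in an earlier pass
            }
            // A txid above expectedTxId would be a gap; this sketch simply applies it.
            blocks.merge(op.path, 1, Integer::sum);
            expectedTxId = op.txid + 1;
        }
    }

    int blocks(String path) {
        return blocks.getOrDefault(path, 0);
    }

    public static void main(String[] args) {
        List<Op> log = Arrays.asList(new Op(1, "/f"), new Op(2, "/f"), new Op(3, "/g"));

        TxidReplaySketch guarded = new TxidReplaySketch();
        guarded.load(log, true);
        guarded.load(log, true);     // second read of the same log is skipped
        System.out.println(guarded.blocks("/f") + " " + guarded.blocks("/g"));     // 2 1

        TxidReplaySketch unguarded = new TxidReplaySketch();
        unguarded.load(log, false);
        unguarded.load(log, false);  // second read is applied again
        System.out.println(unguarded.blocks("/f") + " " + unguarded.blocks("/g")); // 4 2
    }
}
```

In this toy model, replaying without the check doubles every block count. Whether a particular hadoop-0.20.2 operation behaves the same way when applied twice depends on that operation's own semantics, which the sketch does not model.
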
>
> But in hadoop-0.20.2, FSEditLog doesn't store the transaction id in the
> edits log file. So I want to know: if the StandbyNN applies the same
> transaction log to its namespace more than once, will that corrupt the
> StandbyNN's namespace?
> Please give me some advice.
>
> Best Regards
> LiuLei
>

--
Todd Lipcon
Software Engineer, Cloudera