Hadoop >> mail # user >> Is this a fair summary of HDFS failover?

Re: Is this a fair summary of HDFS failover?
I completely agree. I am using your postings and the group's to define
the direction and approaches, but I am also trying every solution myself,
and I am beginning to do just that, starting with the AvatarNode.

Thank you,
Mark

On Mon, Feb 14, 2011 at 4:43 PM, M. C. Srivas <[EMAIL PROTECTED]> wrote:

> I understand you are writing a book, "Hadoop in Practice". If so, it's
> important that what's recommended in the book be verified in
> practice. (I mean, beyond simply posting in this newsgroup; for instance,
> the recommendations on NN fail-over should be tried out before writing
> about how to do it.) Otherwise you won't know whether your recommendations
> really work.
>
>
>
> On Mon, Feb 14, 2011 at 12:31 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
>
> > Thank you, M. C. Srivas, that was enormously useful. I understand it now,
> > but just to be complete, I have re-formulated my points according to your
> > comments:
> >
> >   - In 0.20 the Secondary NameNode performs snapshotting. Its data can be
> >   used to recreate the HDFS if the Primary NameNode fails. The procedure is
> >   manual and may take hours, and there is also data loss since the last
> >   snapshot;
> >   - In 0.21 there is a Backup Node (HADOOP-4539), which aims to help with
> >   HA and act as a cold spare. The data loss is less than with Secondary NN,
> >   but it is still manual and potentially error-prone, and it takes hours;
> >   - There is an AvatarNode patch available for 0.20, and Facebook runs its
> >   cluster that way, but the patch submitted to Apache requires testing, and
> >   the developers adopting it must do some custom configurations and also
> >   exercise care in their work.
> >
> > As a conclusion, when building an HA HDFS cluster, one needs to follow the
> > best practices outlined by Tom White
> > <http://www.cloudera.com/wp-content/uploads/2010/03/HDFS_Reliability.pdf>,
> > and may still need to resort to specialized NFS filers for running the
> > NameNode.
> >
> > Sincerely,
> > Mark
> >
> >
> >
> > On Mon, Feb 14, 2011 at 11:50 AM, M. C. Srivas <[EMAIL PROTECTED]> wrote:
> >
> > > The summary is quite inaccurate.
> > >
> > > On Mon, Feb 14, 2011 at 8:48 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi,
> > > >
> > > > is it accurate to say that
> > > >
> > > >   - In 0.20 the Secondary NameNode acts as a cold spare; it can be used to
> > > >   recreate the HDFS if the Primary NameNode fails, but with a delay of
> > > >   minutes if not hours, and there is also some data loss;
> > > >
> > >
> > >
> > > The Secondary NN is not a spare. It is used to augment the work of the
> > > Primary, by offloading some of its work to another machine. The work
> > > offloaded is "log rollup" or "checkpointing". This has been a source of
> > > constant confusion (someone named it incorrectly as a "secondary" and now
> > > we are stuck with it).
> > >
> > > The Secondary NN certainly cannot take over for the Primary. That is not
> > > its purpose.
> > >
> > > Yes, there is data loss.
> > >
> > >
> > >
> > >
> > > >   - in 0.21 there are streaming edits to a Backup Node (HADOOP-4539), which
> > > >   replaces the Secondary NameNode. The Backup Node can be used as a warm
> > > >   spare, with the failover being a matter of seconds. There can be multiple
> > > >   Backup Nodes, for additional insurance against failure, and previous best
> > > >   common practices apply to it;
> > > >
> > >
> > >
> > > There is no "Backup NN" in the manner you are thinking of. It is completely
> > > manual, and requires restart of the "whole world", and takes about 2-3 hours
> > > to happen. If you are lucky, you may have only a little data loss (people
> > > have lost entire clusters due to this -- from what I understand, you are far
> > > better off resurrecting the Primary instead of trying to bring up a Backup
> > > NN).
> > >
> > > In any case, when you run it like you mention above, you will have to