Hadoop >> mail # user >> Is this a fair summary of HDFS failover?


Re: Is this a fair summary of HDFS failover?
I completely agree. I am using your postings and the group's to define the
direction and approaches, but I am also trying out every solution, and I am
beginning to do just that with the AvatarNode now.

Thank you,
Mark

On Mon, Feb 14, 2011 at 4:43 PM, M. C. Srivas <[EMAIL PROTECTED]> wrote:

> I understand you are writing a book, "Hadoop in Practice". If so, it's
> important that what's recommended in the book be verified in practice
> (I mean, beyond simply posting in this newsgroup; for instance, the
> recommendations on NN fail-over should be tried out before writing about
> how to do it). Otherwise you won't know whether your recommendations
> really work.
>
>
>
> On Mon, Feb 14, 2011 at 12:31 PM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
>
> > Thank you, M. C. Srivas, that was enormously useful. I understand it now,
> > but just to be complete, I have re-formulated my points according to your
> > comments:
> >
> >   - In 0.20 the Secondary NameNode performs snapshotting. Its data can be
> >   used to recreate the HDFS if the Primary NameNode fails. The procedure
> >   is manual and may take hours, and any data written since the last
> >   snapshot is lost;
> >   - In 0.21 there is a Backup Node (HADOOP-4539), which aims to help with
> >   HA and act as a cold spare. The data loss is less than with the
> >   Secondary NN, but the procedure is still manual and potentially
> >   error-prone, and it takes hours;
> >   - There is an AvatarNode patch available for 0.20, and Facebook runs its
> >   cluster that way, but the patch submitted to Apache requires testing,
> >   and developers adopting it must do some custom configuration and
> >   exercise care in their work.
> >
> > In conclusion, when building an HA HDFS cluster, one needs to follow the
> > best practices outlined by Tom White
> > <http://www.cloudera.com/wp-content/uploads/2010/03/HDFS_Reliability.pdf>,
> > and may still need to resort to specialized NFS filers for running the
> > NameNode.
> >
> > Sincerely,
> > Mark
> >
> >
> >
> > On Mon, Feb 14, 2011 at 11:50 AM, M. C. Srivas <[EMAIL PROTECTED]> wrote:
> >
> > > The summary is quite inaccurate.
> > >
> > > On Mon, Feb 14, 2011 at 8:48 AM, Mark Kerzner <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hi,
> > > >
> > > > is it accurate to say that
> > > >
> > > >   - In 0.20 the Secondary NameNode acts as a cold spare; it can be
> > > >   used to recreate the HDFS if the Primary NameNode fails, but with a
> > > >   delay of minutes if not hours, and there is also some data loss;
> > > >
> > >
> > >
> > > The Secondary NN is not a spare. It is used to augment the work of the
> > > Primary, by offloading some of its work to another machine. The work
> > > offloaded is "log rollup" or "checkpointing". This has been a source of
> > > constant confusion (someone named it incorrectly as a "secondary" and
> > > now we are stuck with it).
> > >
> > > The Secondary NN certainly cannot take over for the Primary. That is
> > > not its purpose.
> > >
> > > Yes, there is data loss.
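
To make the "log rollup" point concrete, here is a minimal toy sketch in Python. It is not Hadoop code: the function names, the (op, path, size) edit format, and the dict-based "image" are all assumptions made up for illustration. It only shows the general idea that a checkpoint folds the edit log into the image, and that anything logged after the last checkpoint is missing if you rebuild from the checkpoint alone.

```python
# Toy model only: every name here is hypothetical, none of it is a Hadoop API.

def apply_edit(namespace, edit):
    """Apply one logged operation to an in-memory namespace (path -> size)."""
    op, path, size = edit
    if op == "create":
        namespace[path] = size
    elif op == "delete":
        namespace.pop(path, None)
    return namespace


def checkpoint(fsimage, edit_log):
    """'Log rollup': fold the edit log into a copy of the image.

    Returns the merged image and a fresh, empty edit log.
    """
    merged = dict(fsimage)
    for edit in edit_log:
        apply_edit(merged, edit)
    return merged, []


def recover_from_checkpoint(last_checkpoint):
    """Rebuild the namespace from the checkpoint alone (no later edits)."""
    return dict(last_checkpoint)


fsimage = {"/a": 10}
edit_log = [("create", "/b", 20), ("delete", "/a", 0)]

# The helper machine does this merge so the primary does not have to.
fsimage, edit_log = checkpoint(fsimage, edit_log)

# Edits made after the checkpoint exist only in the primary's own log.
edit_log.append(("create", "/c", 30))

# If the primary's disk is gone, only the checkpoint is left to rebuild from.
recovered = recover_from_checkpoint(fsimage)
lost_edits = list(edit_log)
```

In this toy model `recovered` contains `/b` but not `/c`, which is the sense in which a checkpoint-only recovery loses whatever happened since the last rollup.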
> > >
> > >
> > >
> > >
> > > >   - in 0.21 there are streaming edits to a Backup Node (HADOOP-4539),
> > > >   which replaces the Secondary NameNode. The Backup Node can be used
> > > >   as a warm spare, with failover being a matter of seconds. There can
> > > >   be multiple Backup Nodes, for additional insurance against failure,
> > > >   and previous best common practices apply to it;
> > > >
> > >
> > >
> > > There is no "Backup NN" in the manner you are thinking of. It is
> > > completely manual, requires a restart of the "whole world", and takes
> > > about 2-3 hours to happen. If you are lucky, you may have only a little
> > > data loss (people have lost entire clusters due to this; from what I
> > > understand, you are far better off resurrecting the Primary instead of
> > > trying to bring up a Backup NN).
> > >
> > > In any case, when you run it like you mention above, you will have to