Re: question about ZKFC daemon
Hi all,

I'm only testing the new HA feature; I'm not on a production system.

Well, let's talk about the number of nodes and the ZKFC daemons.

In this url:
https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover

you can read:
If you have configured automatic failover using the ZooKeeper
FailoverController (ZKFC), you must install and start the zkfc daemon on
each of the machines that runs a NameNode.

So the number of ZKFC daemons is two, but reading this URL:

http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper

you can read this:
In a typical deployment, ZooKeeper daemons are configured to run on three
or five nodes

I think that to ensure a good HA environment (of any kind) you need an odd
number of nodes to avoid split-brain. The problem I see here is that if
ZKFC monitors the NameNodes, in a CDH4 environment you only have 2 NNs
(active + standby).

So I'm a bit confused by this deployment...

Any suggestions?

Thanks in advance for all your answers

Kind regards,

ESGLinux
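
As Colin's correction in the quoted thread below points out, the odd-number rule applies to the quorum-based services (the ZooKeeper ensemble and the JournalNodes), not to ZKFC, which only talks to ZooKeeper and runs next to each of the two NameNodes. A minimal sketch of the majority arithmetic, assuming a plain strict-majority quorum; the function names are made up for illustration and are not part of any Hadoop or ZooKeeper API:

```python
def majority(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority can still be formed."""
    return members - majority(members)


if __name__ == "__main__":
    # Compare ensemble sizes: an even size tolerates no more failures
    # than the odd size just below it.
    for members in range(1, 7):
        print(
            f"{members} members: majority={majority(members)}, "
            f"tolerated failures={tolerated_failures(members)}"
        )
```

Under that assumption, 2 members tolerate no failures, 3 and 4 both tolerate one, and 5 tolerates two, which is why three or five nodes are the usual sizes for the ZooKeeper and JournalNode pools.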
2013/1/14 Colin McCabe <[EMAIL PROTECTED]>

> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <[EMAIL PROTECTED]>
> wrote:
> > Hi ESGLinux,
> >
> > In production, you need to run QJM on at least 3 nodes.  You also need
> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
> > if you like, though.
>
> Er, this should read "You also need to run ZooKeeper on at least 3
> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
> active NN node and the standby NN node.
>
> Colin
>
> >
> > Of course, none of this is "needed" to set up an example cluster.  If
> > you just want to try something out, you can run everything on the same
> > node if you want.  It depends on what you're trying to do.
> >
> > cheers,
> > Colin
> >
> >
> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <[EMAIL PROTECTED]> wrote:
> >> Thank you for your answer Craig,
> >>
> >> I'm planning my cluster and for now I'm not sure how many machines I need ;-)
> >>
> >> If I have doubts I'll follow what Cloudera says, and if I have a problem I
> >> know where to ask for explanations :-)
> >>
> >> ESGLinux
> >>
> >>
> >>
> >> 2012/12/28 Craig Munro <[EMAIL PROTECTED]>
> >>>
> >>> OK, I have reliable storage on my datanodes so not an issue for me.  If
> >>> that's what Cloudera recommends then I'm sure it's fine.
> >>>
> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>> Hi Craig,
> >>>>
> >>>> I'm a bit confused. I have read this from Cloudera:
> >>>>
> >>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
> >>>>
> >>>> The JournalNode daemon is relatively lightweight, so these daemons can
> >>>> reasonably be collocated on machines with other Hadoop daemons, for example
> >>>> NameNodes, the JobTracker, or the YARN ResourceManager.
> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker, etc.) so the
> >>>> JournalNodes' local directories can use the reliable local storage on those
> >>>> machines.
> >>>> There must be at least three JournalNode daemons, since edit log
> >>>> modifications must be written to a majority of JournalNodes
> >>>>
> >>>> As you can read, they recommend putting the JournalNode daemons with the
> >>>> NameNodes, but you say the opposite?
> >>>>
> >>>>
> >>>> Thanks for your answer,
> >>>>
> >>>> ESGLinux,
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> 2012/12/28 Craig Munro <[EMAIL PROTECTED]>
> >>>>>
> >>>>> You need the following:
> >>>>>
> >>>>> - active namenode + zkfc
> >>>>> - standby namenode + zkfc
> >>>>> - pool of journal nodes (odd number, 3 or more)
> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
> >>>>>
> >>>>> As the journal nodes hold the namesystem transactions they should not be
> >>>>> co-located with the namenodes in case of failure.  I distribute the