HDFS, mail # user - question about ZKFC daemon


ESGLinux 2012-12-27, 12:03
rahul p 2012-12-28, 07:00
Harsh J 2012-12-27, 16:34
ESGLinux 2012-12-28, 09:25
Craig Munro 2012-12-28, 10:08
ESGLinux 2012-12-28, 10:37
Craig Munro 2012-12-28, 10:51
ESGLinux 2012-12-28, 11:02
Colin McCabe 2013-01-14, 19:49
Colin McCabe 2013-01-14, 20:34
ESGLinux 2013-01-15, 09:53
Harsh J 2013-01-15, 09:55
Re: question about ZKFC daemon
ESGLinux 2013-01-15, 10:08
Hi Harsh,

Now I'm completely confused :-))))

As you pointed out, ZKFC runs only on the NNs. That looks right.

So, what are the ZK peers (the odd number I'm looking for), and where do I
have to run them? On another 3 nodes?

As I can read at the previous URL:

In a typical deployment, ZooKeeper daemons are configured to run on three
or five nodes. Since ZooKeeper itself has light resource requirements, it
is acceptable to collocate the ZooKeeper nodes on the same hardware as the
HDFS NameNode and Standby Node. Many operators choose to deploy the third
ZooKeeper process on the same node as the YARN ResourceManager. It is
advisable to configure the ZooKeeper nodes to store their data on separate
disk drives from the HDFS metadata for best performance and isolation.

Here, does "ZooKeeper daemons" mean ZKFC?
Thanks

ESGLinux
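
The odd-number rule for the ZK peers comes down to majority arithmetic, and it applies to ZooKeeper itself, not to ZKFC. Below is a minimal Python sketch of that arithmetic, plus a layout like the one described in the quoted documentation (ZooKeeper on the two NameNode hosts and on the ResourceManager host, ZKFC only next to each NameNode). The host names `nn1`, `nn2`, `rm1` and the helper functions are made up for illustration; none of this is part of ZooKeeper or Hadoop.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of peers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Peers that can be lost while a majority can still be formed."""
    return ensemble_size - majority(ensemble_size)


# Hypothetical layout: host names are assumptions, not taken from the thread.
layout = {
    "nn1": {"NameNode", "ZKFC", "ZooKeeper"},
    "nn2": {"NameNode", "ZKFC", "ZooKeeper"},
    "rm1": {"ResourceManager", "ZooKeeper"},
}


def count(daemon: str) -> int:
    """Number of hosts in the layout that run the given daemon."""
    return sum(daemon in daemons for daemons in layout.values())


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} peers: majority {majority(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
    print("ZooKeeper peers:", count("ZooKeeper"))
    print("ZKFC daemons:", count("ZKFC"))
```

Under this arithmetic a fourth peer tolerates no more failures than three do, which is why three or five are the usual choices, while the ZKFC count simply follows the number of NameNodes.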

2013/1/15 Harsh J <[EMAIL PROTECTED]>

> Hi,
>
> I fail to see your confusion.
>
> ZKFC != ZK
>
> ZK is quorum software, like QJM is. The ZK peers are to be run in odd
> numbers, just as the JNs are.
>
> ZKFC is something the NN needs for its Automatic Failover capability. It
> is a client of ZK and therefore requires ZK to be present; it is for ZK
> that the odd # of nodes is suggested. ZKFC itself is to be run only once
> per NN.
>
>
> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> I'm only testing the new HA feature. I'm not on a production system.
>>
>> Well, let's talk about the number of nodes and the ZKFC daemons.
>>
>> At this URL:
>>
>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>
>> you can read:
>> If you have configured automatic failover using the ZooKeeper
>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>> each of the machines that runs a NameNode.
>>
>> So the number of ZKFC daemons is two, but at this URL:
>>
>>
>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>
>> you can read this:
>> In a typical deployment, ZooKeeper daemons are configured to run on three
>> or five nodes
>>
>> I think that to ensure a good HA environment (of any kind) you need an
>> odd number of nodes to avoid split-brain. The problem I see here is that if
>> ZKFC monitors the NameNodes, in a CDH4 environment you only have 2 NNs
>> (active + standby).
>>
>> So I'm a bit confused by this deployment...
>>
>> Any suggestion?
>>
>> Thanks in advance for all your answers
>>
>> Kind regards,
>>
>> ESGLinux
>>
>>
>>
>>
>> 2013/1/14 Colin McCabe <[EMAIL PROTECTED]>
>>
>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <[EMAIL PROTECTED]>
>>> wrote:
>>> > Hi ESGLinux,
>>> >
>>> > In production, you need to run QJM on at least 3 nodes.  You also need
>>> > to run ZKFC on at least 3 nodes.  You can run them on the same nodes
>>> > if you like, though.
>>>
>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>> nodes."  ZKFC, which talks to ZooKeeper, runs on only two nodes-- the
>>> active NN node and the standby NN node.
>>>
>>> Colin
>>>
>>> >
>>> > Of course, none of this is "needed" to set up an example cluster.  If
>>> > you just want to try something out, you can run everything on the same
>>> > node if you want.  It depends on what you're trying to do.
>>> >
>>> > cheers,
>>> > Colin
>>> >
>>> >
>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <[EMAIL PROTECTED]> wrote:
>>> >> Thank you for your answer, Craig.
>>> >>
>>> >> I'm planning my cluster and for now I'm not sure how many machines I
>>> >> need ;-)
>>> >>
>>> >> If I have doubts I'll do what Cloudera says, and if I have a problem
>>> >> I know where to ask for explanations :-)
>>> >>
>>> >> ESGLinux
>>> >>
>>> >>
>>> >>
>>> >> 2012/12/28 Craig Munro <[EMAIL PROTECTED]>
>>> >>>
>>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>>> >>> If that's what Cloudera recommends then I'm sure it's fine.

Harsh J 2013-01-15, 10:11
ESGLinux 2013-01-15, 10:17