Zookeeper >> mail # user >> How dose zookeeper handle fault-detect in distributed storage system


fengguang gong 2013-09-11, 06:17
German Blanco 2013-09-11, 07:12
fengguang gong 2013-09-11, 08:11
German Blanco 2013-09-11, 10:09
RE: How dose zookeeper handle fault-detect in distributed storage system
Hi Fengguang,

> Here my question is: How does zk handle fault detection in this
> system (how do the dispatch node and middleware know that a store node is
> down)?

Adding a few more points; I hope they help :)

ZooKeeper has a hierarchical structure, much like a distributed file system. You can create znodes in it much as you would create files under a directory.

For your use case, each store node creates a ZooKeeper client session and registers with ZooKeeper (the store node can create an ephemeral znode in ZooKeeper to show its presence).

Dispatch and middleware nodes can watch these znodes using ZooKeeper's watch notification mechanism. When a store node goes down, it loses its ZooKeeper connection, and once its session ends the ephemeral znode is deleted. In turn, the middleware and dispatch nodes receive watcher notifications. You can put your own logic in these watchers.
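
To make the flow above concrete, here is a toy in-memory model in Python. It is not the ZooKeeper client API: the class `ToyRegistry`, its methods, and the path `/stores/store-1` are all made up for illustration, and the model ignores things a real ensemble handles (parent znodes, session timeouts, reconnects). It only sketches the idea that an ephemeral znode disappears with its session and that a watch fires once.

```python
from collections import defaultdict


class ToyRegistry:
    """In-memory stand-in for a ZooKeeper ensemble (illustration only)."""

    def __init__(self):
        self.ephemerals = {}               # znode path -> owning session id
        self.watches = defaultdict(list)   # znode path -> pending one-shot callbacks

    def create_ephemeral(self, session_id, path):
        """Record an ephemeral znode tied to the given session."""
        self.ephemerals[path] = session_id

    def exists(self, path, watch=None):
        """Report whether path exists; optionally leave a one-shot watch on it."""
        if watch is not None:
            self.watches[path].append(watch)
        return path in self.ephemerals

    def expire_session(self, session_id):
        """Delete every ephemeral znode owned by the session and fire its watches."""
        owned = [p for p, s in self.ephemerals.items() if s == session_id]
        for path in owned:
            del self.ephemerals[path]
            # pop() removes the callbacks, so each watch fires at most once.
            for callback in self.watches.pop(path, []):
                callback("NodeDeleted", path)


registry = ToyRegistry()
events = []

# A store node registers its presence under a hypothetical path.
registry.create_ephemeral(session_id=1, path="/stores/store-1")

# A dispatch node checks for the znode and leaves a watch on it.
alive_before = registry.exists(
    "/stores/store-1", watch=lambda event, path: events.append((event, path))
)

# The store node crashes; its session ends, so the znode disappears.
registry.expire_session(1)
alive_after = registry.exists("/stores/store-1")

print(alive_before, alive_after, events)
```

Running it prints `True False [('NodeDeleted', '/stores/store-1')]`. Because the watch is one-shot, a real watcher would normally re-register itself after each notification if it wants to keep tracking the store nodes.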

Please see the following links to learn more about znodes and watches.

http://zookeeper.apache.org/doc/r3.4.5/zookeeperProgrammers.html#sc_zkDataModel_znodes
http://zookeeper.apache.org/doc/r3.4.5/zookeeperProgrammers.html#Ephemeral+Nodes
http://zookeeper.apache.org/doc/r3.4.5/zookeeperProgrammers.html#ch_zkWatches

-Rakesh

-----Original Message-----
From: German Blanco [mailto:[EMAIL PROTECTED]]
Sent: 11 September 2013 15:39
To: [EMAIL PROTECTED]
Subject: Re: How dose zookeeper handle fault-detect in distributed storage system

I am not sure that I get your point.
You need a ZooKeeper ensemble with several servers (the minimum recommended number is 3). One of them will be elected leader of the ensemble when required, but normally you don't have to worry about that.
Normally you will run the servers in the ZooKeeper ensemble as independent processes, each on a different machine in order to increase redundancy. These processes could run on any machines in your network, including, I assume, the machines that host the dispatch nodes, store nodes and middleware. On top of that, each of your dispatch/store/middleware nodes must somehow be linked with a ZooKeeper client, which manages the information stored in the ZooKeeper ensemble.
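
As a rough illustration of why 3 is the usual minimum: the ensemble keeps serving only while a strict majority of its servers is up, so the number of server failures it can survive follows from simple arithmetic. The function names below are made up for this sketch; they are not part of any ZooKeeper tool.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 3, 4, 5):
    print(f"{n} servers: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

A single server tolerates no failures, three servers tolerate one, and four servers still tolerate only one, which is why odd ensemble sizes are the common choice.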

On Wed, Sep 11, 2013 at 10:11 AM, fengguang gong <[EMAIL PROTECTED]> wrote:

> Thanks very much, German.
> The second possibility in your answer would be great, but I am
> still confused about the servers (leader) and the clients.
> Should I distinguish between servers (leader) and clients among the
> dispatch nodes, store nodes and middleware?
> Or should I just ignore all these concepts?
> On 2013-9-11, at 3:12 PM, German Blanco <[EMAIL PROTECTED]> wrote:
>
> > Hello Fengguang Gong,
> > I think there is more than one answer to your question.
> > One possibility would be to have each of your nodes as zookeeper
> > clients that create an ephemeral node in the zookeeper data, and
> > query and most likely subscribe to changes so that they are notified
> > about the status of the ephemeral zookeeper nodes created by the
> > rest of the nodes. If you are only interested in dispatch and
> > middleware nodes knowing about the status of store nodes, then you
> > could have ephemeral zookeeper nodes created only by the store
> > nodes, and dispatch and middleware nodes querying and subscribing
> > to the resulting status.
> > You will need to make sure that the events of store nodes going up
> > and down are reflected correctly in the creation and deletion of
> > the zookeeper node.
> > You will also have to tune the heartbeat between zookeeper client
> > and server so that it fits your requirements.
> > Does that suit you?
> > Any other options?
> > Good luck :-)
> >
> >
> > On Wed, Sep 11, 2013 at 8:17 AM, fengguang gong <[EMAIL PROTECTED]> wrote:
> >
> >> Hi all:
> >>
> >> Recently my lab has wanted to use zk to manage our cluster (fault
> >> detection). Our cluster includes three kinds of nodes:
> >> 1. dispatch node: load balances and dispatches data.
> >> 2. store node: receives data from the dispatch node and stores it.