Zookeeper >> mail # user >> Zookeeper Cluster with quorum Lost, what next ?


Seb Geek 2013-10-27, 17:38
RE: Zookeeper Cluster with quorum Lost, what next ?
Hi Sébastien,

>>>>>In case the quorum is lost in a ZooKeeper cluster of 2 * n + 1 nodes (so more than n servers have died, or only one server is still alive, for example), what happens next:

In this case, all the live servers will try to form a quorum by running leader election, and will be in the LOOKING state.

>>>>>- do other alive servers of the cluster continue to respond to all requests from clients (read requests and write requests)?
>>>>- do other alive servers of the cluster continue to respond, but only to read requests from clients (write requests are all rejected)?
>>>>- do other alive servers reject all requests from clients?
Both read and write operations require a quorum (a LEADER plus enough FOLLOWERs to make up a majority). In your case, more than n of the 2 * n + 1 servers have failed, so all the live servers will be in the LOOKING state, trying to form a quorum. All existing clients will be notified with a DISCONNECTED event, any read or write request from a client will be rejected, and you will see connection loss exceptions on the client side.
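
To make the quorum arithmetic concrete, here is a minimal sketch in Java. It assumes the default simple-majority quorum; the class and method names (QuorumMath, majority, hasQuorum) are made up for illustration and are not part of the ZooKeeper API.

```java
public final class QuorumMath {
    private QuorumMath() {}

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    public static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** True when enough servers are alive to elect a leader (simple-majority assumption). */
    public static boolean hasQuorum(int ensembleSize, int aliveServers) {
        return aliveServers >= majority(ensembleSize);
    }

    public static void main(String[] args) {
        int n = 2;
        int ensembleSize = 2 * n + 1; // an ensemble of 5 servers
        // Walk down from "all alive" to "none alive" and report the quorum status.
        for (int alive = ensembleSize; alive >= 0; alive--) {
            System.out.printf("alive=%d of %d, quorum=%b%n",
                    alive, ensembleSize, hasQuorum(ensembleSize, alive));
        }
    }
}
```

With 2 * n + 1 servers the majority is n + 1, so the ensemble survives up to n failures; once more than n servers are down, hasQuorum returns false and the remaining servers stay in LOOKING.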

If you want the live servers to keep serving read requests from clients after the quorum is partitioned or lost, ZooKeeper has another good feature for this: the read-only server (an experimental feature that has to be enabled explicitly).
In this case, the live servers will transition to read-only mode and will serve read requests from read-only clients. In the background, all these servers will keep participating in leader election to form a quorum.
Please refer to the 'Read Only Mode Server' section in http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#Experimental+Options%2FFeatures

>>>>- do other alive servers die?
No, all the live servers will keep trying to form a quorum and will stay in the leader election phase, so the server state will be LOOKING.
Please refer to http://zookeeper.apache.org/doc/r3.2.2/zookeeperInternals.html#sc_leaderElection

>>>>- do other alive servers reject connections from clients that were connected to a server that died?
>>>>- do other alive servers accept new connections from clients (whether or not these clients were connected with a session to another server that died)?
>>>>- do other alive servers disconnect all clients?
Existing clients will get a DISCONNECTED event and will retry indefinitely, trying each of the servers listed in the connect string. New client connections will also fail, and those clients will likewise keep retrying indefinitely.
In general, when a ZooKeeper quorum is lost:
- Live servers will be in the leader election phase and will reject all client connections.
- Existing clients will get a DISCONNECTED event and will retry indefinitely against all the servers listed in the connect string.
http://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#ch_zkSessions
There is also a special feature, the read-only server, which serves read-only client requests even after the quorum is lost. Please refer to the following links to understand more about it:
https://issues.apache.org/jira/i#browse/ZOOKEEPER-784
http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#Experimental+Options%2FFeatures
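
The behaviour summarised above can be written down as a small decision function. This is only a toy model of the rules in this mail, not ZooKeeper code: the class QuorumLossModel, its enums, and the two boolean flags are hypothetical names, and it assumes a simple-majority quorum and a client that can reach at least one live server.

```java
public final class QuorumLossModel {
    private QuorumLossModel() {}

    public enum Request { READ, WRITE }
    public enum Outcome { SERVED, REJECTED }

    /**
     * Decide what happens to a client request.
     *
     * readOnlyModeEnabled  : hypothetical flag, true if the servers run with read-only mode on
     * clientAllowsReadOnly : hypothetical flag, true if the client opted in to read-only servers
     */
    public static Outcome handle(int ensembleSize, int aliveServers,
                                 boolean readOnlyModeEnabled, boolean clientAllowsReadOnly,
                                 Request request) {
        // With a majority alive, a leader exists and both reads and writes are served.
        boolean quorum = aliveServers > ensembleSize / 2;
        if (quorum) {
            return Outcome.SERVED;
        }
        // Without a quorum, only reads from read-only clients can be served,
        // and only if the live servers have switched to read-only mode.
        boolean readOnlyPath = aliveServers > 0
                && readOnlyModeEnabled
                && clientAllowsReadOnly
                && request == Request.READ;
        return readOnlyPath ? Outcome.SERVED : Outcome.REJECTED;
    }

    public static void main(String[] args) {
        // 5-server ensemble with 2 servers alive: the quorum is lost.
        System.out.println(handle(5, 2, false, false, Request.READ));  // no read-only mode
        System.out.println(handle(5, 2, true, true, Request.READ));    // read-only mode, read
        System.out.println(handle(5, 2, true, true, Request.WRITE));   // read-only mode, write
    }
}
```

Writes are never served without a quorum in this model, which matches the description above: read-only mode only relaxes the rule for reads.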

Hope this information is helpful.

-Rakesh

-----Original Message-----
From: Seb Geek [mailto:[EMAIL PROTECTED]]
Sent: 27 October 2013 23:09
To: [EMAIL PROTECTED]
Subject: Zookeeper Cluster with quorum Lost, what next ?

Hello,

My context: I want to create a ZooKeeper cluster of 2 * n + 1 server nodes and several Java clients (like SolrCloud nodes or homemade Java clients that implement and use ZooKeeper recipes).

In case the quorum is lost in a ZooKeeper cluster of 2 * n + 1 nodes (so more than n servers have died, or only one server is still alive, for example), what happens next:

- do other alive servers of the cluster continue to respond to all requests from clients (read requests and write requests)?
- do other alive servers of the cluster continue to respond, but only to read requests from clients (write requests are all rejected)?
- do other alive servers reject all requests from clients?
- do other alive servers die?

- do other alive servers reject connections from clients that were connected to a server that died?
- do other alive servers accept new connections from clients (whether or not these clients were connected with a session to another server that died)?
- do other alive servers disconnect all clients?
And my last question: what happens next when the quorum is restored (the dead servers are restarted)? Are pending write requests applied in the ZooKeeper cluster? ...

Thank you in advance for your answers,

Regards,
Sébastien