Zookeeper, mail # user - Zookeeper Configuration Sync


Re: Zookeeper Configuration Sync
Alexander Shraer 2013-06-22, 06:47
Hi Mohammad,

+1 for the unique ensemble identifier request. We actually discussed
this a long time ago but somehow never got to do this.
Can you open a JIRA for this?
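
To make the request concrete, here is a rough sketch of what such a check
might look like. Nothing here exists in ZooKeeper today: the ensemble ID
values, `EnsembleMismatch`, and `check_peer` are all hypothetical names used
only for illustration.

```python
class EnsembleMismatch(Exception):
    """Raised when a peer reports a different (hypothetical) ensemble ID."""


def check_peer(local_ensemble_id: str, peer_ensemble_id: str) -> bool:
    # Hypothetical handshake step: refuse to proceed with election or
    # reconfiguration messages unless both sides name the same ensemble.
    if local_ensemble_id != peer_ensemble_id:
        raise EnsembleMismatch(
            f"peer belongs to {peer_ensemble_id!r}, expected {local_ensemble_id!r}"
        )
    return True
```

With a check like this, a leftover server from an old deployment would be
rejected before its config version was ever compared.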

Suppose that server A has the latest log but only talks with server B
during leader election (C is down or slow). A doesn't know whether
or not the latest operations in its log are committed (in this case C
would have them, but A doesn't know if this is the case). So, to be
safe, everything in A's log gets committed in this case.

We took the approach that a reconfiguration is treated similarly to
normal data updates. When a server has the most up-to-date log and
talks with a majority during leader election, it will be elected
leader and commit its log to the other servers. It won't truncate its
log even if it's clear that some operations were not committed. This is true for
normal updates as well as for reconfigurations.
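
As a toy illustration of the rule above (not the actual ZooKeeper election
code), the sketch below models each server as a list of (zxid, operation)
entries. The `Server` class and `elect_and_sync` function are made up for
this example, and the real election compares more than the last zxid.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    log: list = field(default_factory=list)  # entries are (zxid, operation)

    def last_zxid(self) -> int:
        return self.log[-1][0] if self.log else 0


def elect_and_sync(reachable, ensemble_size):
    """Elect the reachable server with the most up-to-date log, if a
    majority is reachable, and copy its whole log to the others."""
    if len(reachable) <= ensemble_size // 2:
        return None  # no majority, so no leader
    # Simplification: compare only the last zxid, break ties by name.
    leader = max(reachable, key=lambda s: (s.last_zxid(), s.name))
    for s in reachable:
        if s is not leader:
            # The leader never truncates; followers adopt its full log,
            # including entries that were never known to be committed.
            s.log = list(leader.log)
    return leader


# A has the latest log, B is behind, C is down (not reachable).
a = Server("A", [(1, "create /x"), (2, "reconfig add C")])
b = Server("B", [(1, "create /x")])
leader = elect_and_sync([a, b], ensemble_size=3)
print(leader.name)  # prints "A"
```

The same path handles the reconfig entry and the ordinary update, which is
the point: a reconfiguration in the leader's log is pushed out like any
other operation.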

BTW, I'm not sure why you are shutting down servers or clearing the
data during reconfigurations, or why you're manually changing config
files.
You can add servers to the ensemble by invoking the reconfig command
and this will make all the necessary changes to the config files,
including specifying the right config version.
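
For example, a small helper like the one below could build the command to
type into the ZooKeeper CLI. The helper names are made up, and the host and
ports (zk-3, 2888, 3888, 2181) are only assumed values for illustration;
please check the server spec format against the reconfig documentation for
your version.

```python
def server_spec(server_id, host, quorum_port, election_port, client_port,
                role="participant"):
    # Format assumed from the dynamic reconfiguration docs:
    # server.<id>=<host>:<quorum port>:<election port>[:role];<client port>
    return (f"server.{server_id}={host}:{quorum_port}:{election_port}"
            f":{role};{client_port}")


def reconfig_add_command(spec):
    # Incremental form of the CLI command that adds one server.
    return f"reconfig -add {spec}"


# Assumed values: server id 3 on host zk-3 with conventional ports.
cmd = reconfig_add_command(server_spec(3, "zk-3", 2888, 3888, 2181))
print(cmd)  # prints "reconfig -add server.3=zk-3:2888:3888:participant;2181"
```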

Alex
On Fri, Jun 21, 2013 at 3:00 PM, Mohammad Shamma
<[EMAIL PROTECTED]> wrote:
> I have a use case where I dynamically grow a zookeeper ensemble on the same
> fixed set of machines multiple times. In each iteration, the ensemble is
> grown incrementally till it consists of "n" servers. I will refer to the
> machines hosting the servers as zk-1, zk-2, ..., zk-n.
>
> At the beginning of each iteration, I wipe out the zookeeper data
> directories of zk-1 and zk-2, then statically configure the zookeeper
> servers on both of them to form a 2-way ensemble. After that, I start
> growing the ensemble incrementally by reconfiguring the zookeeper ensemble
> to include zk-i, then clearing, configuring, and starting the zookeeper server
> on zk-i (that is for i in range(2,n)).
>
> I was not shutting down or cleaning up the previous ensemble zookeeper
> servers at the end of each iteration. After initializing the 2-way ensemble
> on zk-1 and zk-2, I observed that the servers from the old deployment were
> contacting the servers of the new ensemble and triggering an ensemble
> reconfiguration. A quick look at the code seems to suggest that this is
> triggered simply because the config version of the old deployment's
> servers is higher than the one found on the new ensemble's
> servers. Can anyone confirm my understanding of this behaviour of zookeeper?
>
> I also noticed that this reconfiguration behaviour holds for n=3. For example,
> let's say the zookeeper servers on zk-1 and zk-2 are freshly configured to form
> a 2-way ensemble, and zk-3 contains a leftover server that was part of an
> older 3-way ensemble (that included two obsolete servers on zk-1 and zk-2).
> To me it seems a bit counterintuitive for one server (on zk-3) to drive
> the configuration of two other servers (zk-1, zk-2). The reason why it
> seems counterintuitive is that the majority of the servers in the ensemble
> agree on a different config version. Can somebody explain how zookeeper
> handles this situation?
>
> One final note, it would be really useful if a zookeeper ensemble would
> have a unique identifier that could be set in the "zoo.cfg" file. Whenever
> servers communicate with each other, they would verify that they are
> talking to peers of the same ensemble before commencing with further
> actions. Does that sound like a reasonable request?
>
> Thanks,
>
> --
> Mohammad Shamma