Re: Rolling config change considered harmful?
I got the test cluster into the state described, with 2 leaders. I then allocated 100 Curator clients to write nodes "/n", where n is the client's index (i.e. "/0", "/1", ...). The idea was that the writes would be distributed across the cluster instances. I then allocated a single Curator instance dedicated to one of the server instances, did a sync, and did an exists() to verify that each cluster instance had all the nodes. For the 2-leader cluster, this fails.

-JZ
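
The check described above can be sketched as a toy model. This is not Curator or ZooKeeper code: the function `run_check`, the `leader_of` mapping, and the round-robin assignment of clients to servers are all hypothetical. It also assumes that A follows D (the thread only says A is "following") and that a write reaches only the servers following the same leader as the server the client is connected to.

```python
# Toy model only: no real ZooKeeper or Curator calls are made.
from typing import Dict, List, Set


def run_check(leader_of: Dict[str, str], node_count: int = 100) -> List[str]:
    """Return the servers that are missing at least one written node.

    leader_of maps each server to the leader it follows (a leader maps to
    itself). Assumption: a write is applied only on servers that follow the
    same leader as the server the writing client is connected to.
    """
    servers = sorted(leader_of)
    data: Dict[str, Set[str]] = {s: set() for s in servers}

    # Writers: client n connects to server n mod len(servers) and creates "/n".
    for n in range(node_count):
        entry = servers[n % len(servers)]
        leader = leader_of[entry]
        for s in servers:
            if leader_of[s] == leader:
                data[s].add(f"/{n}")

    # Verifier: every server should hold every "/n".
    expected = {f"/{n}" for n in range(node_count)}
    return [s for s in servers if data[s] != expected]


# Healthy ensemble: everyone follows B.
one_leader = {"A": "B", "B": "B", "C": "B", "D": "B", "E": "B"}
# State from the thread: B and D both lead (A -> D is an assumption).
two_leaders = {"A": "D", "B": "B", "C": "B", "D": "D", "E": "D"}

print(run_check(one_leader))   # []
print(run_check(two_leaders))  # ['A', 'B', 'C', 'D', 'E']
```

Under these assumptions, every server in the two-leader state is missing the nodes written through the other group, which matches the failed exists() check.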

On Jun 14, 2013, at 1:54 PM, "FPJ" <[EMAIL PROTECTED]> wrote:

> I messed up the last sentence, here is what I was trying to say:
>
> It is ok to have two servers thinking they are leaders as long as only one is
> able to commit txns at a time by having a quorum of supporters. Each server
> is going to follow a single leader, so I don't see a problem in your scenario
> with the information you provided. Now if you tell me that when you keep
> sending new transactions to those leaders, both keep committing new
> transactions (not the same txns), then we have a problem. I don't see how
> this can happen, though.
>
> Also, one of the leaders should eventually time out and go back to leader
> election.
>
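
The quorum argument above can be checked by brute force. The sketch below assumes simple majority quorums over a single server list shared by the whole ensemble; the helper names `majority_quorums` and `all_pairs_intersect` are hypothetical. It enumerates every majority of a 5-server ensemble and confirms that any two of them share at least one server. Since each server follows a single leader, two leaders cannot both hold a quorum at the same time under that assumption.

```python
from itertools import combinations
from typing import List, Set


def majority_quorums(servers: List[str]) -> List[Set[str]]:
    """All subsets containing strictly more than half of the servers."""
    need = len(servers) // 2 + 1
    return [
        set(combo)
        for size in range(need, len(servers) + 1)
        for combo in combinations(servers, size)
    ]


def all_pairs_intersect(quorums: List[Set[str]]) -> bool:
    """True if every pair of quorums has at least one server in common."""
    return all(q1 & q2 for q1, q2 in combinations(quorums, 2))


ensemble = ["A", "B", "C", "D", "E"]
quorums = majority_quorums(ensemble)

print(len(quorums))                  # 16 (10 of size 3, 5 of size 4, 1 of size 5)
print(all_pairs_intersect(quorums))  # True
```

The guarantee depends on every server using the same server list; it says nothing about ensembles whose members disagree on membership.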
>> -----Original Message-----
>> From: FPJ [mailto:[EMAIL PROTECTED]]
>> Sent: 14 June 2013 21:44
>> To: [EMAIL PROTECTED]
>> Subject: RE: Rolling config change considered harmful?
>>
>> It is ok to have two servers thinking they are leaders as long as only one is
>> able to commit txns at a time by having a quorum of supporters. Each server
>> is going to follow a single leader, so I don't see a problem in your scenario
>> with the information you provided. Now if you tell me that when you keep
>> sending new transactions to those leaders and they keep committing them
>> forever, both keep committing new transactions, then we have a problem. I
>> don't see how this can happen, though.
>>
>> Also, one of the leaders should eventually time out and go back to leader
>> election.
>>
>> -Flavio
>>
>>> -----Original Message-----
>>> From: Jordan Zimmerman [mailto:[EMAIL PROTECTED]]
>>> Sent: 14 June 2013 21:10
>>> To: [EMAIL PROTECTED]
>>> Subject: Re: Rolling config change considered harmful?
>>>
>>> More on this.
>>>
>>> I just did some testing with wholly contrived scenarios and I was able to
>>> get a cluster in a state where it had two leaders. NOTE: all of this was
>>> done with Curator's TestingCluster
>>>
>>> * Create a 5 node ensemble
>>> * Save the list of instances, directories etc.
>>> * Wait for quorum
>>> * Shut down the cluster
>>> * Restart the ensemble with the same ports and directories. However,
>>>   this time, give different server lists to each instance:
>>>   * Instance A -> A D E
>>>   * Instance B -> A B C
>>>   * Instance C -> A B C
>>>   * Instance D -> A D E
>>>   * Instance E -> A D E
>>>
>>> There is at least one common server amongst all of them. When I restarted
>>> the cluster with this configuration I ended up with two leaders. This
>>> state stays consistent after leader election (i.e. it doesn't try to
>>> re-elect).
>>>
>>> A: following
>>> B: leading
>>> C: following
>>> D: leading
>>> E: following
>>>
>>> This may be the correct behavior, i.e. it may be that ZooKeeper cannot
>>> realistically run in this scenario. What it means to me is that rolling
>>> config changes, if too lax, can create chaos.
>>>
>>> -Jordan
>>>
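
One possible reading of the two-leader state above is that each leader counts a majority against its own server list. The sketch below is a toy calculation, not ZooKeeper code. It assumes simple majority quorums computed per instance from that instance's configured list, and it assumes A follows D (the message only reports A as "following"). The names `views`, `following`, and `supporters_of` are hypothetical.

```python
from typing import Dict, Set

# Server list each instance was restarted with (from the message above).
views: Dict[str, Set[str]] = {
    "A": {"A", "D", "E"},
    "B": {"A", "B", "C"},
    "C": {"A", "B", "C"},
    "D": {"A", "D", "E"},
    "E": {"A", "D", "E"},
}

# Observed roles: B and D lead. Whom A follows is an assumption (D).
following: Dict[str, str] = {"A": "D", "B": "B", "C": "B", "D": "D", "E": "D"}


def supporters_of(leader: str) -> Set[str]:
    """Servers that follow `leader` and appear in the leader's own list."""
    return {s for s, l in following.items() if l == leader and s in views[leader]}


def has_quorum(leader: str) -> bool:
    """Majority test against the leader's own (possibly divergent) list."""
    return len(supporters_of(leader)) > len(views[leader]) // 2


for leader in ("B", "D"):
    print(leader, sorted(supporters_of(leader)), has_quorum(leader))
# B ['B', 'C'] True
# D ['A', 'D', 'E'] True

# The two supporter sets share no server, so neither leader blocks the other.
print(supporters_of("B") & supporters_of("D"))  # set()
```

With divergent server lists, both B and D see a majority of what they each believe the ensemble to be, which is consistent with neither of them going back to leader election.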
>>> On Jun 14, 2013, at 12:27 PM, "FPJ" <[EMAIL PROTECTED]> wrote:
>>>
>>>> In the case I described, the txn is not reflected in the zookeeper state.
>>>> Say T is a create txn. Once C is elected, it determines the initial
>>>> history of txns for the new epoch that is starting and this initial
>>>> history is not going to include T.
>>>>
>>>> In the example below, I was ignoring the client that triggered T,
>>>> but since it has been acked by a quorum, the client might as well
>>>> have received the confirmation of the operation and think that the