Zookeeper >> mail # user >> Rolling config change considered harmful?


Re: Rolling config change considered harmful?
I got the test cluster into the state described, with 2 leaders. I then allocated 100 Curator clients to write nodes "/n", where n is the index (i.e. "/0", "/1", …). The idea was that the nodes would be distributed around the cluster instances. I then allocated a single Curator instance dedicated to one of the server instances, did a sync, and did an exists() to verify that each cluster instance had all the nodes. For the 2-leader cluster, this fails.

-JZ
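
The check described above can be approximated with a toy in-memory model. This is only a sketch and is not Curator or ZooKeeper code: the names run_writes, missing_nodes and LEADER_OF are hypothetical, the round-robin client placement is an assumption, and so is the mapping of followers to leaders (the thread does not say which leader A follows). It assumes a write is applied only on the servers attached to the same leader as the server the client is connected to.

```python
# Toy in-memory model; this is NOT Curator or ZooKeeper code.

# Assumption: which leader each server ends up attached to.
# The thread only says B and D lead; A following D is a guess.
LEADER_OF = {"A": "D", "B": "B", "C": "B", "D": "D", "E": "D"}


def run_writes(leader_of, n_clients=100):
    """Client i connects round-robin to a server and creates "/i".

    The write is applied only on servers attached to the same leader
    as the server the client is connected to.
    """
    servers = sorted(leader_of)
    data = {s: set() for s in servers}
    for i in range(n_clients):
        entry = servers[i % len(servers)]
        for s in servers:
            if leader_of[s] == leader_of[entry]:
                data[s].add(f"/{i}")
    return data


def missing_nodes(data, n_clients=100):
    """Per server, the nodes an exists() check would fail to find."""
    expected = {f"/{i}" for i in range(n_clients)}
    return {s: expected - nodes for s, nodes in data.items()}


if __name__ == "__main__":
    missing = missing_nodes(run_writes(LEADER_OF))
    for server in sorted(missing):
        print(server, "is missing", len(missing[server]), "nodes")
```

Under these assumptions every server is missing some nodes, which matches the failure reported above. With a single leader for all five servers the same model reports nothing missing.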

On Jun 14, 2013, at 1:54 PM, "FPJ" <[EMAIL PROTECTED]> wrote:

> I messed up the last sentence, here is what I was trying to say:
>
> It is ok to have two servers thinking they are leaders as long as only one
> is able to commit txns at a time by having a quorum of supporters. Each server
> is going to follow a single leader, so I don't see a problem in your scenario
> with the information you provided. Now if you tell me that when you keep
> sending new transactions to those leaders, both keep committing new
> transactions (not the same txns), then we have a problem. I don't see how
> this can happen, though.
>
> Also, one of the leaders should eventually time out and go back to leader
> election.
>
>> -----Original Message-----
>> From: FPJ [mailto:[EMAIL PROTECTED]]
>> Sent: 14 June 2013 21:44
>> To: [EMAIL PROTECTED]
>> Subject: RE: Rolling config change considered harmful?
>>
>> It is ok to have two servers thinking they are leaders as long as only one
>> is able to commit txns at a time by having a quorum of supporters. Each server
>> is going to follow a single leader, so I don't see a problem in your scenario
>> with the information you provided. Now if you tell me that when you keep
>> sending new transactions to those leaders and they keep committing them
>> forever, both keep committing new transactions, then we have a problem. I
>> don't see how this can happen, though.
>>
>> Also, one of the leaders should eventually time out and go back to leader
>> election.
>>
>> -Flavio
>>
>>> -----Original Message-----
>>> From: Jordan Zimmerman [mailto:[EMAIL PROTECTED]]
>>> Sent: 14 June 2013 21:10
>>> To: [EMAIL PROTECTED]
>>> Subject: Re: Rolling config change considered harmful?
>>>
>>> More on this.
>>>
>>> I just did some testing with wholly contrived scenarios and I was able to
>>> get a cluster in a state where it had two leaders. NOTE: all of this was
>>> done with Curator's TestingCluster.
>>>
>>> * Create a 5 node ensemble
>>> * Save the list of instances, directories etc.
>>> * Wait for quorum
>>> * Shut down the cluster
>>> * Restart the ensemble with the same ports and directories. However,
>>>   this time, give different server lists to each instance:
>>>   * Instance A -> A D E
>>>   * Instance B -> A B C
>>>   * Instance C -> A B C
>>>   * Instance D -> A D E
>>>   * Instance E -> A D E
>>>
>>> There is at least one common server amongst all of them. When I restarted
>>> the cluster with this configuration I ended up with two leaders. This state
>>> stays consistent after leader election (i.e. it doesn't try to re-elect).
>>>
>>> A: following
>>> B: leading
>>> C: following
>>> D: leading
>>> E: following
>>>
>>> This may be the correct behavior, i.e. it may be that ZooKeeper cannot
>>> realistically run in this scenario. What it means to me is that rolling
>>> config changes, if too lax, can create chaos.
>>>
>>> -Jordan
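
One way to read the server lists and roles above is to ask, for each leader, whether its supporters form a majority of the leader's own configured list. The sketch below does that arithmetic. It assumes majority quorums, assumes C follows B and E follows D, and tries A under either leader because the thread does not say which one A follows. The name leaders_with_majority is hypothetical and is not part of ZooKeeper or Curator.

```python
# Per-instance server lists from the restart step above.
SPLIT_VIEWS = {
    "A": {"A", "D", "E"},
    "B": {"A", "B", "C"},
    "C": {"A", "B", "C"},
    "D": {"A", "D", "E"},
    "E": {"A", "D", "E"},
}
LEADERS = ["B", "D"]


def leaders_with_majority(views, leaders, following):
    """Map each leader to True if it has a strict majority of its own view.

    `following` maps each follower to the leader it follows (an assumption
    here; the thread only reports "following" without naming the leader).
    """
    result = {}
    for leader in leaders:
        view = views[leader]
        supporters = {leader} | {s for s, l in following.items() if l == leader}
        result[leader] = len(supporters & view) > len(view) // 2
    return result


if __name__ == "__main__":
    for a_follows in LEADERS:
        following = {"A": a_follows, "C": "B", "E": "D"}
        print("A follows", a_follows, "->",
              leaders_with_majority(SPLIT_VIEWS, LEADERS, following))

    # Contrast: every instance shares the full five-server list.
    uniform = {s: set("ABCDE") for s in "ABCDE"}
    following = {"A": "D", "C": "B", "E": "D"}
    print("uniform ->", leaders_with_majority(uniform, LEADERS, following))
```

With the split lists, each leader needs only 2 of its 3 configured servers, so B (with C) and D (with E) both reach a majority whichever leader A follows. With one shared five-server list the threshold is 3, and in this model only one of the two leaders can reach it.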
>>>
>>> On Jun 14, 2013, at 12:27 PM, "FPJ" <[EMAIL PROTECTED]> wrote:
>>>
>>>> In the case I described, the txn is not reflected in the zookeeper state.
>>>> Say T is a create txn. Once C is elected, it determines the initial
>>>> history of txns for the new epoch that is starting and this initial
>>>> history is not going to include T.
>>>>
>>>> In the example below, I was ignoring the client that triggered T,
>>>> but since it has been acked by a quorum, the client might as well
>>>> have received the confirmation of the operation and think that the