Zookeeper, mail # user - How to join quorum without restarting existing servers


Re: How to join quorum without restarting existing servers
Diego Oliveira 2013-11-06, 17:39
Bae,

   Just a note: when running ZooKeeper on Amazon AWS, the instance IP
changing at restart is a nightmare. One solution is to do as you said and
use an elastic IP, but the maximum of 5 is limiting. Another option is to
configure a VPC. I ran into these problems last year.
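
As a way to check whether a replacement instance has actually joined, here
is a minimal Python sketch that sends the `stat` four-letter command to each
server's client port and reads the `Mode:` line. The function names, the host
names and the port 2181 are assumptions for illustration, and on newer
ZooKeeper releases the four-letter commands may need to be whitelisted
before they answer.

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_output):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    # No 'Mode:' line, e.g. the server is up but not serving requests yet.
    return None


def ensemble_modes(servers, port=2181):
    """Map each host (hypothetical names) to its reported mode or an error."""
    modes = {}
    for host in servers:
        try:
            modes[host] = parse_mode(four_letter_word(host, port, "stat"))
        except OSError as exc:  # refused, timed out, or name did not resolve
            modes[host] = "unreachable: %s" % exc
    return modes
```

Calling ensemble_modes(["zookeeper-01", "zookeeper-02", "zookeeper-03"])
with your own host names should show one leader and the rest as followers
when the ensemble is healthy; a replaced server that reports no mode or is
unreachable has not joined yet.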

Att,
      Diego.
On Tue, Nov 5, 2013 at 4:18 PM, Bae, Jae Hyeon <[EMAIL PROTECTED]> wrote:

> I am attaching the log file. Could you take a look at why the new instance
> cannot join the quorum?
>
>
> On Tue, Nov 5, 2013 at 9:52 AM, Bae, Jae Hyeon <[EMAIL PROTECTED]> wrote:
>
>> Thanks a lot Ben
>>
>> We are also using ZooKeeper in AWS with elastic IPs. The reason I asked
>> is that when a bad ZooKeeper EC2 instance is terminated and a new
>> instance is launched with the previous elastic IP, it cannot join the
>> quorum, and there are no specific error messages. But when I did a
>> rolling restart, the new instance started normally, synchronized and
>> joined the quorum.
>>
>> As I understand German's response, the new instance should start,
>> synchronize, and join quorum successfully without any impact on existing
>> instances but it didn't. I will investigate further.
>>
>> Thank you
>> Best, Jae
>>
>>
>> On Tue, Nov 5, 2013 at 8:24 AM, Ben Hall <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Jae,
>>>
>>> I wrote that article several years ago. (tbh - I hope it is not totally
>>> out of date by now).  I agree with German's points.
>>>
>>> The issue it was solving was to replace a bad server without having to
>>> shutdown the ensemble and without having to update the config files on
>>> each server. I would also add that this only works as long as the server
>>> names and ports are the same - iirc at the time the article was written
>>> we were using servers in AWS and referencing them either by assigned
>>> hostnames such as zookeeper-[01|11] or by elastic IPs that could be
>>> moved from server to server.
>>>
>>> If I understand your question correctly, if you are "adding a new server"
>>> such as going from 7 to 9 servers, then this approach won't benefit you
>>> as you.
>>>
>>> We also used this approach when we would upgrade the servers, but like
>>> German said we did it one server at a time so that the Leader election
>>> could be natural.  This allowed us to upgrade a pool of 11 servers that
>>> were responsible for many thousands of client connections without any
>>> downtime.
>>>
>>> Thanks
>>> Ben
>>>
>>>
>>> On 11/5/13 6:51 AM, "German Blanco" <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>> >... and make sure that there is no rubbish in the data dir of the new
>>> >server.
>>> >
>>> >
>>> >On Tue, Nov 5, 2013 at 3:49 PM, German Blanco <
>>> >[EMAIL PROTECTED]> wrote:
>>> >
>>> >> Hello Jae,
>>> >>
>>> >> I think that the answer to your question is "no, there is no benefit
>>> >> in a rolling restart in that case".
>>> >> If you remove a machine that was hosting a zookeeper server that was
>>> >> part of a cluster, and replace it with a new machine, with a zookeeper
>>> >> server running the same software version and listening on the same IP
>>> >> and ports, then this new server will join the cluster, synchronize and
>>> >> start working normally.
>>> >> I wouldn't recommend replacing more than one server at a time, and I
>>> >> think that it is better if the new server joins while the existing
>>> >> quorum is stable (avoid leader elections while the new server joins,
>>> >> i.e. avoid restarts or disconnections of the existing servers).
>>> >>
>>> >> Best regards,
>>> >>
>>> >> Germán.
>>> >>
>>> >>
>>> >> On Tue, Nov 5, 2013 at 6:42 AM, Bae, Jae Hyeon <[EMAIL PROTECTED]>
>>> >> wrote:
>>> >>
>>> >>> Hi
>>> >>>
>>> >>> I read an article
>>> >>>
>>> >>> http://www.benhallbenhall.com/2011/07/rolling-restart-in-apache-zookeeper-to-dynamically-add-servers-to-the-ensemble/
>>> >>>
>>> >>> My question is, even though failed hardware is replaced with the same
>>> >>> IP address, do I need to do rolling restart for adding replaced hardware
Att.
Diego de Oliveira
System Architect
[EMAIL PROTECTED]
www.diegooliveira.com
Never argue with a fool -- people might not be able to tell the difference