Zookeeper >> mail # dev >> Performing no downtime hardware changes to a live zookeeper cluster


Re: Performing no downtime hardware changes to a live zookeeper cluster
We don't shuffle IPs after the initial resolution of IP addresses.

In DNS RR, you resolve to a list of IPs, shuffle these, and then we round
robin through them trying to connect. If you re-resolve on every
round-robin, you have to put in logic to know which ones have changed and
somehow maintain that shuffle order or you aren't doing a fair back end
round robin, which people using the ZK client against DNS RR are relying on
today.
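
To make the resolve-once behaviour above concrete, here is a minimal Java sketch. It is only an illustration, not the ZooKeeper client's actual code: the class name `ShuffledAddressList` and its methods are hypothetical. It resolves a name to its full list of IPs a single time, shuffles that list once, and then walks it in the same order on every connection attempt.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration; not the ZooKeeper client's own class. */
class ShuffledAddressList {
    private final List<InetAddress> addresses;
    private int next = 0;

    /** Copies an already resolved list and shuffles it once; the order never changes afterwards. */
    ShuffledAddressList(List<InetAddress> resolved, Random random) {
        if (resolved.isEmpty()) {
            throw new IllegalArgumentException("no addresses to connect to");
        }
        this.addresses = new ArrayList<>(resolved);
        Collections.shuffle(this.addresses, random);
    }

    /** Resolves a DNS RR name (one name, possibly many IPs) exactly once. */
    static ShuffledAddressList fromDnsName(String name, Random random)
            throws UnknownHostException {
        return new ShuffledAddressList(Arrays.asList(InetAddress.getAllByName(name)), random);
    }

    /** Returns the next address to try, wrapping around at the end of the list. */
    InetAddress next() {
        InetAddress address = addresses.get(next);
        next = (next + 1) % addresses.size();
        return address;
    }
}
```

Because the shuffled order is fixed at construction time, every address gets its turn before any is retried. Re-resolving in the middle of a pass would replace the list and lose that position, which is the fairness problem described above.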

If you just have machine names in a list that you pass in, then yes, we
could re-resolve on every reconnect and you could just re-alias that name
to a new IP. But that logic would have to be written so that it doesn't
break people using DNS RR.
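
The per-hostname alternative could look roughly like the following Java sketch. Again this is an assumption-laden illustration rather than anything in the ZooKeeper client: `ReResolvingHostList` and its `Resolver` hook are hypothetical names. It shuffles the host names once, and looks each name up again every time that name comes up in the rotation.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration; not the ZooKeeper client's own class. */
class ReResolvingHostList {
    /** Hypothetical lookup hook so the DNS call can be replaced, for example in tests. */
    interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    private final List<String> hostNames;
    private final Resolver resolver;
    private int next = 0;

    /** Shuffles the host names once; only the name-to-IP mapping is refreshed later. */
    ReResolvingHostList(List<String> hostNames, Resolver resolver, Random random) {
        if (hostNames.isEmpty()) {
            throw new IllegalArgumentException("no host names given");
        }
        this.hostNames = new ArrayList<>(hostNames);
        Collections.shuffle(this.hostNames, random);
        this.resolver = resolver;
    }

    /** Convenience factory that uses the JVM's normal name lookup. */
    static ReResolvingHostList usingDns(List<String> hostNames) {
        return new ReResolvingHostList(hostNames, InetAddress::getAllByName, new Random());
    }

    /** Picks the next host name in the fixed order and resolves it again right now. */
    InetAddress next() throws UnknownHostException {
        String host = hostNames.get(next);
        next = (next + 1) % hostNames.size();
        InetAddress[] resolved = resolver.resolve(host);
        if (resolved.length == 0) {
            throw new UnknownHostException(host);
        }
        // Assumes one machine per name: any further IPs behind the name are ignored.
        return resolved[0];
    }
}
```

Taking only the first resolved address is what makes this workable for a list of single-machine names and unsuitable for DNS RR, where one name stands for the whole ensemble. Note also that the JVM may cache lookups itself, so a re-aliased name might not be seen immediately even with this scheme.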

I realize that moving machines is difficult when you have lots of clients.
I'm a bit surprised your admins can't maintain machine IP addresses on a
machine move given a cluster of that complexity, though. I also think that
if we're going to be putting special cases like this in we might just want
to go all the way to a pluggable reconnection scheme, but maybe that is too
aggressive.

C

On Mon, Jan 9, 2012 at 1:51 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:

> Maybe I didn't express myself clearly. When I said DNS RR, I meant its
> simplest implementation which resolves a hostname to multiple IPs.
>
> Whatever method you use to map host names to IPs, the problem is that
> the zookeeper client code will always cache the IPs. So to be able to
> swap out a machine, all clients would have to be restarted, which if
> you have 100s of clients, is a major pain. If you want to move the
> entire cluster to new machines, this becomes even harder.
>
> I don't see why re-resolving host names to IPs in the reconnect logic
> is a problem for zookeeper, since you shuffle the list of IPs anyways.
>
> Thanks,
> Neha
>
>
> On Mon, Jan 9, 2012 at 10:31 AM, Camille Fournier <[EMAIL PROTECTED]>
> wrote:
> > You can't sensibly round robin within the client code if you re-resolve on
> > every reconnect, if you're using dns rr. If that's your goal you'd want a
> > list of dns alias names and re-resolve each hostname when you hit it on
> > reconnect. But that will break people using dns rr.
> > You can look into writing a pluggable reconnect logic into the zk client,
> > that's what would be required to do this but at the end of the day you'll
> > have to give your users special clients to make that work.
> >
> > C
> > On Jan 9, 2012 1:16 PM, "Neha Narkhede" <[EMAIL PROTECTED]> wrote:
> >
> >> I was reading through the client code and saw that zookeeper client
> >> caches the server IPs during startup and maintains it for the rest of
> >> its lifetime. If we go with the DNS RR approach or a load balancer
> >> approach, and later swap out a server with a new one ( with a new IP
> >> ), all clients would have to be restarted to be able to "forget" the
> >> old IP and see the new one. That doesn't look like a clean approach to
> >> such upgrades. One way of getting around this problem is to add the
> >> resolution of host names to IPs to the "reconnect" logic in addition
> >> to the constructor. So when such upgrades happen and the client
> >> reconnects, it will see the new list of IPs and won't need to
> >> be restarted.
> >>
> >> Does this approach sound good or am I missing something here ?
> >>
> >> Thanks,
> >> Neha
> >>
> >> On Wed, Dec 21, 2011 at 7:21 PM, Camille Fournier <[EMAIL PROTECTED]>
> >> wrote:
> >> > DNS RR is good. I had good experiences using that for my client
> >> > configs for exactly the reasons you are listing.
> >> >
> >> > On Wed, Dec 21, 2011 at 8:43 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
> >> >> Thanks for the responses!
> >> >>
> >> >>>> How are your clients configured to find the zks now?
> >> >>
> >> >> Our clients currently use the list of hostnames and ports that
> >> >> comprise the zookeeper cluster. For example,
> >> >> zoo1:port1,zoo2:port2,zoo3:port3
> >> >>
> >> >>>> > - switch DNS,
> >> >>> - wait for caches to die,