Zookeeper >> mail # dev >> Performing no downtime hardware changes to a live zookeeper cluster


Re: Performing no downtime hardware changes to a live zookeeper cluster
Sounds fine to me, though it should probably be a flaggable option.

C
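
To make the "flaggable option" concrete, here is a minimal sketch of what re-resolving on reconnect behind a flag might look like. It is not ZooKeeper client code: the class `ReconnectHostList`, its `Resolver` interface, and the `reResolveOnReconnect` flag are all hypothetical names chosen for illustration. The only real API used for lookup is `InetAddress.getAllByName`.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of the ZooKeeper client. */
final class ReconnectHostList {

    /** Pluggable lookup so the sketch can be exercised without real DNS. */
    interface Resolver {
        List<InetAddress> resolve(String host) throws UnknownHostException;
    }

    /** Default lookup; one name may map to several addresses (DNS RR). */
    static final Resolver SYSTEM_DNS =
            host -> Arrays.asList(InetAddress.getAllByName(host));

    private final List<String> hostNames;
    private final boolean reResolveOnReconnect; // the assumed "flaggable option"
    private final Resolver resolver;
    private List<InetAddress> addresses;
    private int next = 0;

    ReconnectHostList(List<String> hostNames, boolean reResolveOnReconnect,
                      Resolver resolver) throws UnknownHostException {
        this.hostNames = new ArrayList<>(hostNames);
        this.reResolveOnReconnect = reResolveOnReconnect;
        this.resolver = resolver;
        this.addresses = resolveAndShuffle();
    }

    /** Resolves every configured name and shuffles the combined result. */
    private List<InetAddress> resolveAndShuffle() throws UnknownHostException {
        List<InetAddress> resolved = new ArrayList<>();
        for (String host : hostNames) {
            resolved.addAll(resolver.resolve(host));
        }
        if (resolved.isEmpty()) {
            throw new UnknownHostException("no addresses for " + hostNames);
        }
        Collections.shuffle(resolved);
        return resolved;
    }

    /** Next address to try, in round-robin order over the current list. */
    InetAddress next() {
        InetAddress address = addresses.get(next);
        next = (next + 1) % addresses.size();
        return address;
    }

    /** Called when the connection is lost, before the next connect attempt. */
    void onDisconnect() throws UnknownHostException {
        if (reResolveOnReconnect) {
            addresses = resolveAndShuffle();
            next = 0;
        }
        // Flag off: keep the list that was resolved and shuffled at startup.
    }
}
```

Note that with the flag on, this sketch reshuffles after every re-resolution, so it does not preserve the original round-robin order. That is the fairness concern for DNS RR users raised in the quoted discussion below, and it is one reason to leave such a flag off by default.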
On Mon, Jan 9, 2012 at 3:33 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:

> >> If you just have machine names in a list that you pass in, then yes, we
> >> could re-resolve on every reconnect and you could just re-alias that name
> >> to a new IP. But you'll have to put in logic that will do that but not
> >> break people using DNS RR.
>
> Having a list of machine names that can be changed to point to new IPs
> seems reasonable too. To be able to do the upgrade without restarting
> all clients, besides turning off DNS caching in the JVM, we still have
> to solve the problem of the zookeeper client caching the IPs in code.
> Having 2 levels of DNS caching, one in the JVM and one in code (which
> cannot be turned off), doesn't look like a good idea. Unless I'm
> missing the purpose of such IP caching in zookeeper?
>
> >> I realize that moving machines is difficult when you have lots of clients.
> >> I'm a bit surprised your admins can't maintain machine IP addresses on a
> >> machine move given a cluster of that complexity, though.
>
> It's not that it can't be done, but it definitely carries quite some
> operational overhead. We are trying to brainstorm various approaches
> and come up with one that will involve the least overhead on such
> upgrades going forward.
>
> Having said that, re-resolving host names on reconnect doesn't look
> like a bad idea, provided it doesn't break the DNS RR use case. If
> that sounds good, can I go ahead and file a JIRA for this?
>
> Thanks,
> Neha
>
> On Mon, Jan 9, 2012 at 11:04 AM, Camille Fournier <[EMAIL PROTECTED]>
> wrote:
> > We don't shuffle IPs after the initial resolution of IP addresses.
> >
> > In DNS RR, you resolve to a list of IPs, shuffle these, and then we round
> > robin through them trying to connect. If you re-resolve on every
> > round-robin, you have to put in logic to know which ones have changed and
> > somehow maintain that shuffle order or you aren't doing a fair back end
> > round robin, which people using the ZK client against DNS RR are relying
> > on today.
> >
> > If you just have machine names in a list that you pass in, then yes, we
> > could re-resolve on every reconnect and you could just re-alias that name
> > to a new IP. But you'll have to put in logic that will do that but not
> > break people using DNS RR.
> >
> > I realize that moving machines is difficult when you have lots of clients.
> > I'm a bit surprised your admins can't maintain machine IP addresses on a
> > machine move given a cluster of that complexity, though. I also think that
> > if we're going to be putting special cases like this in we might just want
> > to go all the way to a pluggable reconnection scheme, but maybe that is too
> > aggressive.
> >
> > C
> >
> > On Mon, Jan 9, 2012 at 1:51 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
> >
> >> Maybe I didn't express myself clearly. When I said DNS RR, I meant its
> >> simplest implementation which resolves a hostname to multiple IPs.
> >>
> >> Whatever method you use to map host names to IPs, the problem is that
> >> the zookeeper client code will always cache the IPs. So to be able to
> >> swap out a machine, all clients would have to be restarted, which, if
> >> you have 100s of clients, is a major pain. If you want to move the
> >> entire cluster to new machines, this becomes even harder.
> >>
> >> I don't see why re-resolving host names to IPs in the reconnect logic
> >> is a problem for zookeeper, since you shuffle the list of IPs anyway.
> >>
> >> Thanks,
> >> Neha
> >>
> >>
> >> On Mon, Jan 9, 2012 at 10:31 AM, Camille Fournier <[EMAIL PROTECTED]>
> >> wrote:
> >> > You can't sensibly round robin within the client code if you re-resolve
> >> > on every reconnect, if you're using dns rr. If that's your goal you'd want a
> >> > list of dns alias names and re-resolve each hostname when you hit it on
> >> > reconnect. But that will break people using dns rr.
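
For the JVM-level half of the "2 levels of DNS caching" mentioned above, a minimal sketch of turning off the JVM's cache of successful lookups is shown here. The class and method names are hypothetical; `networkaddress.cache.ttl` is the standard Java security property. In practice it should be set before the JVM performs its first host name lookup, since the value may be read only once. This does nothing about addresses the client itself has already resolved and kept.

```java
import java.security.Security;

/** Hypothetical holder class; only the property name is a real JVM setting. */
final class DnsCacheSettings {

    /** Call early in startup, before any host name has been resolved. */
    static void disablePositiveDnsCache() {
        // "0" means never cache successful lookups; "-1" means cache forever.
        Security.setProperty("networkaddress.cache.ttl", "0");
    }
}
```

Even with this in place, a client that resolves its server list once at startup will keep using the old IPs until it is restarted, which is the second level of caching the thread is about.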