howard chen 2013-02-26, 17:36
Jordan Zimmerman 2013-02-26, 18:27
Camille Fournier 2013-02-26, 18:35
howard chen 2013-02-27, 09:43
kishore g 2013-02-27, 16:01
-Re: High availability backend services via zookeeper or TCP load balancer
I think that Camille's points are well founded.
However, I have just had two years experience with the opposite approach in
which we use ZK as a way of doing application level load balancing and I
really have come to like it.
The situation is that the job tracker in MapR is a Hadoop Job Tracker
except that it talks to ZK to allow it to be restarted on a different
machine in bad circumstances. All of the task trackers talk to this job
tracker by watching for leader election changes.
We could have used haproxy or something similar here, but we have ZK
already in the system and adding another layer of indirection is an
unpleasant thought. More important, however, is the fact that changes in
leadership are handled at the application level by closing the connection
to the old server and opening a new connection to the new server. If the
switch occurred automagically and asynchronously, it would be a bit harder
to get the code correct.
We also use application level load balancing across multiple NIC's in the
file server portion of our system although without the mediation of ZK in
that case. In that case, we actually stripe RPC requests and replies
across all available connections. This gives us robustness and a
non-trivial bandwidth gain. In this case, an external load balancer would
not work because this load balancing can't reliably be done at the TCP
level due to packet ordering constraints. At the application level,
however, it is easily done because we can make use of multiple TCP
THE net result is that your mileage will vary.
On average, I would strongly consider Camille's advice. For special
applications, application level balancing can be fantastically good.
On Tue, Feb 26, 2013 at 10:35 AM, Camille Fournier <[EMAIL PROTECTED]>wrote:
> You can definitely use ZK for this, as Jordan said. I would really question
> whether writing client-side code to do this vs using something that is
> really designed for writing load balancers (like haproxy) wouldn't be a
> better way to do it however. It doesn't sound like you are creating
> long-lived connections between these clients and services, and instead just
> want to send a request to an ip address that corresponds to the LB for that
> request. Your client-side code is probably going to be buggier and the
> setup/maintenance more complex than if you use a simple load balancer. If
> you're already using ZK for a lot of other things and it is really baked in
> to all your clients, maybe this is the easiest thing to do, but I wouldn't
> use ZK just for this purpose.
> On Tue, Feb 26, 2013 at 1:27 PM, Jordan Zimmerman <
> [EMAIL PROTECTED]> wrote:
> > Service Discovery is a good use-case for ZooKeeper. FYI - Curator has an
> > implementation of this already:
> > https://github.com/Netflix/curator/wiki/Service-Discovery
> > -Jordan
> > On Feb 26, 2013, at 9:36 AM, howard chen <[EMAIL PROTECTED]> wrote:
> > > Hi, I am new to ZK and pls forgive me my question below is stupid :)
> > >
> > > We have custom written servers (not public facing, only called by our
> > > internal system) which is distributed (TCP based, share nothing) that
> > > currently in AWS and with the help of ELB TCP based load balancing, it
> > > somehow fault-tolerant and we are happy with that.
> > >
> > > Now, we need to move off from AWS to save cost as our traffic grow.
> > >
> > > The problem is, now we need to maintain our own load balancers and we
> > need
> > > to make it fault-tolerant (unlike ELB is built-in), the
> > > expected technologies would be haproxy, keepalived.
> > >
> > > While I am thinking this setup, I am thinking why not use ZK instead?
> > > not maintain the currently available servers list in ZK, my initial
> > > algorithms for the internal clients would be:
> > >
> > > 1. Get the latest server list from ZK
> > > 2. Hash the server list and pick one of the backend (load balancing
> > > 3. Call it