High-availability backend services via ZooKeeper or a TCP load balancer
Hi, I am new to ZK, so please forgive me if my question below is stupid :)

We have custom-written servers (not public facing, only called by our
internal systems). They are distributed (TCP based, share nothing) and
currently run in AWS. With the help of ELB's TCP-based load balancing, the
setup is reasonably fault-tolerant and we are happy with that.

Now we need to move off AWS to save cost as our traffic grows.

The problem is that we now need to maintain our own load balancers and
make them fault-tolerant ourselves (with ELB this is built in). The
expected technologies would be haproxy and keepalived.

While thinking about this setup, I wondered: why not use ZK instead? Why
not maintain the list of currently available servers in ZK? My initial
algorithm for the internal clients would be:

1. Get the latest server list from ZK
2. Hash the server list and pick one of the backends (the load balancing
part)
3. Call it
4. If the call fails, update ZK and increment the error count for that
backend
5. If the error count reaches a threshold, remove the backend from the
server list
6. The other clients then no longer see the failing backend
7. Flush the error count so the backend has a chance to become active again
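
To make the steps above concrete, here is a rough sketch in Python. It does
not talk to ZK at all: InMemoryRegistry is a hypothetical stand-in for the
server list and error counts I would keep in ZK, and all names
(pick_backend, call_with_failover, the request key, the threshold of 3) are
made up for illustration. I am also assuming that "flush" in step 7 means
resetting the counts and putting removed backends back into the list.

```python
import hashlib
from typing import Callable, Dict, List


class InMemoryRegistry:
    """Hypothetical stand-in for the server list and error counts in ZK."""

    def __init__(self, servers: List[str], threshold: int = 3) -> None:
        self.servers = list(servers)
        self.errors: Dict[str, int] = {s: 0 for s in servers}
        self.removed: List[str] = []
        self.threshold = threshold

    def get_servers(self) -> List[str]:
        # Step 1: return the current server list.
        return list(self.servers)

    def report_failure(self, server: str) -> None:
        # Step 4: increment the error count for the failing backend.
        self.errors[server] = self.errors.get(server, 0) + 1
        # Steps 5 and 6: at the threshold, hide the backend from everyone.
        if self.errors[server] >= self.threshold and server in self.servers:
            self.servers.remove(server)
            self.removed.append(server)

    def flush_errors(self) -> None:
        # Step 7 (assumed meaning): re-admit removed backends, reset counts.
        self.servers.extend(self.removed)
        self.removed.clear()
        self.errors = {s: 0 for s in self.servers}


def pick_backend(servers: List[str], key: str) -> str:
    # Step 2: hash a request key onto the list (simple modulo hashing).
    if not servers:
        raise RuntimeError("no backends available")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def call_with_failover(
    registry: InMemoryRegistry, key: str, send: Callable[[str], str]
) -> str:
    backend = pick_backend(registry.get_servers(), key)
    try:
        # Step 3: call the chosen backend.
        return send(backend)
    except ConnectionError:
        registry.report_failure(backend)
        raise
```

In a real setup the registry methods would be reads and writes against ZK,
and flush_errors would run on some timer; the sketch only shows the control
flow I have in mind.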

Is my algorithm above valid? Are there any caveats when doing this with ZK?

Looking forward to your comments, thanks.