Re: Kafka Cluster Failover
Joel Koshy 2013-11-27, 20:30
> We could use LVS or some other load balancer/proxy for the Kafka
> connections, and automatically switch between clusters based on
> availability. But, what would this do to live producers and their
> metadata? Would they be able to handle a total switch of cluster
> metadata?
This should be fine - if your remote DC Kafka broker goes down, the
producer should re-issue metadata requests through the load balancer
which (based on my understanding of your topology) should then go to
the main DC's Kafka cluster. The producer will then establish
connections to the main DC's brokers for subsequent sends. (I recall
from earlier in the thread that you are using librdkafka - it should
behave similarly.)
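To make the switch-over smooth, the producer's metadata and retry behavior can be tuned. A rough sketch using librdkafka-style properties (the VIP hostname and the specific values are illustrative, not a recommendation):

```
# Bootstrap through the load-balancer VIP, so metadata requests land on
# whichever cluster the LB currently points at (hostname is hypothetical)
metadata.broker.list=kafka-vip.example.org:9092
# Refresh metadata periodically so cluster changes are picked up
topic.metadata.refresh.interval.ms=10000
# Retry sends that fail while the switch-over is in progress
message.send.max.retries=5
retry.backoff.ms=500
# Require broker acknowledgement so failed sends are actually detected
request.required.acks=1
```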
I'm a bit unclear on your setup - by non-HA broker do you mean non-HA
by virtue of it being a single broker with no replication? You would
still need to register it in a ZooKeeper cluster, right? Also, where
will the events ultimately be consumed? I'm assuming in the main DC -
in which case you would need to ship your Kafka logs from the remote
DC to the main DC anyway, correct?
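The client-side failover you describe can be sketched independently of any Kafka library - the cluster names and send callables below are hypothetical stand-ins for real producer calls, just to show the try-local-then-fail-over flow:

```python
# Sketch: try the local (remote-DC) broker first, then fall back to the
# main DC cluster. Each "cluster" is modeled as a callable that either
# accepts the message or raises on failure.

def send_with_failover(message, clusters):
    """Try each (name, send) pair in priority order; return the name of
    the cluster that accepted the message, or raise if all failed."""
    errors = []
    for name, send in clusters:
        try:
            send(message)
            return name
        except Exception as exc:  # a real client would catch its own error type
            errors.append((name, exc))
    raise RuntimeError("all clusters failed: %r" % errors)

# Illustration with stand-in send functions:
def local_broker(msg):
    raise ConnectionError("remote-DC broker down")

def main_dc(msg):
    pass  # message accepted

print(send_with_failover("webrequest log line",
                         [("remote-dc", local_broker),
                          ("main-dc", main_dc)]))
# prints "main-dc"
```

The open question in your message - what live producers do with stale metadata - is exactly the part this sketch glosses over: after falling back, a real client also has to refresh its metadata against the new cluster.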
On Wed, Nov 27, 2013 at 12:47:01PM -0500, Andrew Otto wrote:
> Wikimedia is close to using Kafka to collect webrequest access logs from multiple data centers. I know that MirrorMaker is the recommended way to do cross-DC Kafka, but this is a lot of overhead for our remote DCs. To set up a highly available Kafka Cluster, we need to add a few more nodes in each DC (brokers and zookeepers). Our remote DCs are used mainly for frontend web caching, and we'd like to keep them that way. We don't want to have to add multiple nodes to each DC just for log delivery.
> We are attempting to produce messages from the remote DCs directly to our main DC's Kafka cluster, but we are worried about data loss during potential times of high latency or link packet loss (we actually had this problem last weekend). Most of the time this works, but it isn't reliable.
> Would it be possible to somehow set up a single non-HA Kafka Broker in our remote DC and produce to that, but then fail over to cross-DC production to our main DC Kafka Cluster?
> We could use LVS or some other load balancer/proxy for the Kafka connections, and automatically switch between clusters based on availability. But, what would this do to live producers and their metadata? Would they be able to handle a total switch of cluster metadata?
> -Andrew Otto