Ah, good question. We really should add this to the documentation.
We run a cluster per data center. All writes always go to the data-center
local cluster. Replication to aggregate clusters that provide the "world
wide" view is done with mirror maker.
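To make the difference in mirroring effort concrete, here is a rough sketch that just counts mirror links for two layouts: a full mesh between regional clusters (the layout asked about below) and local clusters feeding aggregate clusters. MirrorTopology and the cluster names are made up for illustration and are not part of Kafka; each "link" stands for one mirror maker deployment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Kafka: it only enumerates mirror links.
final class MirrorTopology {

    // Every local cluster is mirrored into every aggregate cluster.
    static List<String> aggregateLinks(List<String> locals, List<String> aggregates) {
        List<String> links = new ArrayList<>();
        for (String aggregate : aggregates) {
            for (String local : locals) {
                links.add(local + " -> " + aggregate);
            }
        }
        return links;
    }

    // Every local cluster is mirrored directly into every other local cluster.
    static List<String> fullMeshLinks(List<String> locals) {
        List<String> links = new ArrayList<>();
        for (String source : locals) {
            for (String target : locals) {
                if (!source.equals(target)) {
                    links.add(source + " -> " + target);
                }
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Region names are taken from the question; "aggregate" is an assumed name.
        List<String> regions = Arrays.asList("region1", "region2", "region3");
        System.out.println("full mesh: " + fullMeshLinks(regions));
        System.out.println("one aggregate: "
                + aggregateLinks(regions, Arrays.asList("aggregate")));
    }
}
```

With n regions the full mesh needs n * (n - 1) links, while a layout with k aggregate clusters needs n * k.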
It is also fine to write to or read from a Kafka cluster in a remote colo,
though obviously you have to think about the case where the cluster is not
accessible due to network problems.
Kafka is not designed to run a single cluster spread across geographically
disparate colos, and you would see a few problems in that scenario. The
first is that, as you noted, the latency will be terrible, because with
request.required.acks = -1 each write blocks on the slowest response from
all datacenters. This could be avoided if you lowered
request.required.acks to 1, but that would weaken the durability
guarantees. The second problem is that Kafka will not remain available in
the presence of network partitions, so if the inter-datacenter link failed,
one datacenter would lose access to the cluster. Finally, we have not done
anything to optimize partition placement by colo, so you would not actually
have redundancy between colos because we would often place all replicas in
a single colo.
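For reference, a minimal sketch of the two acks settings discussed above, written as plain java.util.Properties for the 0.8 producer. ProducerAcks and the broker address are made up for illustration; request.required.acks and metadata.broker.list are the 0.8 producer config names.

```java
import java.util.Properties;

// Hypothetical helper, not part of Kafka: builds producer settings only.
final class ProducerAcks {

    // requiredAcks = -1: wait until all in-sync replicas have the message.
    // requiredAcks =  1: wait only for the partition leader (lower latency,
    //                    weaker durability if the leader fails).
    static Properties producerProps(String brokerList, int requiredAcks) {
        Properties props = new Properties();
        props.setProperty("metadata.broker.list", brokerList);
        props.setProperty("request.required.acks", Integer.toString(requiredAcks));
        return props;
    }

    public static void main(String[] args) {
        // The broker address is an assumed placeholder for a colo-local broker.
        Properties durable = producerProps("local-broker1:9092", -1);
        Properties fast = producerProps("local-broker1:9092", 1);
        System.out.println("durable: " + durable);
        System.out.println("fast: " + fast);
    }
}
```

Pointing the producer at the colo-local cluster keeps the -1 setting affordable, since all replicas it waits on are in the same datacenter.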
On Tue, Jul 9, 2013 at 9:34 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
> Our application has multiple producers globally (region1, region2,
> region3). If we group all the brokers together into one cluster, we notice
> an obvious network latency if a broker replicates regionally with the
> request.required.acks = -1.
> Is there any best practice for combating the network latency in the
> deployment topology? Should we segregate the brokers regionally (one kafka
> cluster per region) and set up MirrorMaker between the regions (region1
> <--> region2, region2 <--> region3, region1 <--> region3), total of 6
> mirror makers?