LinkedIn uses the first method for cross DC mirroring. For the second
method, there are 2 main issues. (1) Kafka depends on the ZK service to be
always available. For a ZK cluster to be available, you need a majority of
ZK servers to be up. If you set up a ZK cluster spanning only 2 data
centers, a single DC failure may make the ZK cluster unavailable. You can
set up a ZK cluster spanning 3 or more DCs, which allows you to tolerate at
least 1 DC failure. (2) Long network latency across DCs. In order for the
follower to keep up with the leader in a different DC, you need to tune
parameters like replica.lag.max.messages
and replica.socket.receive.buffer.bytes to amortize the long network
latency.
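
To make the majority rule in point (1) concrete, here is a minimal sketch in
Python. It is not part of Kafka or ZooKeeper; the function names and the DC
layouts (dc1, dc2, dc3 and their server counts) are made up for illustration.
It only checks whether the ZK servers left after losing any one DC still form
a strict majority of the ensemble.

```python
from typing import Dict


def has_quorum(alive: int, total: int) -> bool:
    # ZooKeeper needs a strict majority of the ensemble to be up.
    return alive > total // 2


def survives_any_single_dc_failure(servers_per_dc: Dict[str, int]) -> bool:
    # Hypothetical helper: True if the ensemble keeps quorum no matter
    # which single DC goes down.
    total = sum(servers_per_dc.values())
    return all(has_quorum(total - n, total) for n in servers_per_dc.values())


# Illustrative layouts only (assumed server counts per DC).
two_dcs = {"dc1": 3, "dc2": 2}
three_dcs = {"dc1": 2, "dc2": 2, "dc3": 1}

print(survives_any_single_dc_failure(two_dcs))    # False: losing dc1 leaves 2 of 5
print(survives_any_single_dc_failure(three_dcs))  # True: any loss leaves at least 3 of 5
```

Note that with 3 DCs the result still depends on how the servers are spread:
if one DC holds a majority of the ensemble by itself, losing that DC loses
quorum.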


On Sat, Jun 29, 2013 at 10:50 AM, Yu, Libo <[EMAIL PROTECTED]> wrote: