LinkedIn uses the first method for cross DC mirroring. For the second
method, there are 2 main issues. (1) Kafka depends on the ZK service to be
always available. For a ZK cluster to be available, you need a majority of
ZK servers to be up. If you set up a ZK cluster spanning only 2 data
centers, a single DC failure may make the ZK cluster unavailable. You can
set up a ZK cluster spanning 3 or more DCs, which allows you tolerate at
least 1 DC failure. (2) Long network latency across DCs. In order for the
follow to keep up with the leader in a different DC, you need to tune
parameters like replica.lag.max.messages, replica.lag.time.max.ms,
and replica.socket.receive.buffer.bytes to amortize the long network
On Sat, Jun 29, 2013 at 10:50 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> The first method may lose message if cluster A is permanently down or
> restart right away as B always lags behind A. Even with mirroring, B has
> to wait
> to get missing msg until A is back. So it is not ideal. What type of
> solution did
> you use at linkedin?
> -----Original Message-----
> From: Joel Koshy [mailto:[EMAIL PROTECTED]]
> Sent: Friday, June 28, 2013 8:59 PM
> To: [EMAIL PROTECTED]
> Subject: Re: failover strategy
> The second method (replication across DCs) is not recommended.
> The first set up would work provided the set of topics you are mirroring
> from A->B is disjoint from the set of topics you are mirroring from B->A
> (i.e., to avoid a mirroring loop).
> On Fri, Jun 28, 2013 at 5:29 PM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> > Hi,
> > I can think of two failover strategies. I am not sure which one is the
> right way to go.
> > First method. set up kafka server A on cluster 1 and set up another
> server B on cluster 2.
> > The two clusters are in different data centers. Use customized
> > mirrormaker to sync between the two servers. Use one server in
> > production and use the other one as contingency. If server A is down,
> server B will be used (this can be transparent to publishers/consumers).
> > There may be a lag between the two servers before server A is down .
> > But after A is back, the customized mirrormaker can sync the two. And
> > eventually B will have all the data A had before the failure.
> > Second method. Set up one kafka server using cluster 1 and cluster 2.
> > When creating a topic , always use two replications. For each
> > partition, assign one replication to a broker in cluster 1 and assign
> > the other replication to a broker in cluster 2. So kafka will handle the
> syncing and failover for the two clusters. Is that a right (expected) way
> to use kafka?
> > Regards,
> > Libo