Kafka, mail # user - Replication across Multiple Datacenters


Re: Replication across Multiple Datacenters
Joel Koshy 2013-06-24, 08:54
I don't think replication is ideal for creating a single cluster that spans
DCs, for at least a couple of reasons. First, the replica assignment strategy
is currently not rack- or DC-aware, although that can be addressed by creating
topics manually with an explicit replica assignment. Second, network glitches
and high latencies, which are more likely on a cross-DC link, could result in
more frequent and prolonged periods of under-replication, higher latencies in
the controller-to-broker RPCs, etc. A better approach would be to set up a
mirror cluster; see
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring+%28MirrorMaker%29
(that page needs a few updates for 0.8).
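
To make the manual-assignment workaround concrete, here is a rough Python
sketch that computes a replica list per partition so that every replica of a
partition lands in a different DC. It is only an illustration, not Kafka's own
assignment code: the broker ids, the DC names, and the helper names
(dc_aware_assignment, format_assignment) are made up, and the
colon/comma-separated output format is an assumption about what the
topic-creation tool expects.

```python
def dc_aware_assignment(brokers_by_dc, num_partitions, replication_factor):
    """Return one replica list per partition, with each replica in a different DC.

    brokers_by_dc is a hypothetical mapping of DC name -> list of broker ids.
    """
    dcs = sorted(brokers_by_dc)
    if replication_factor > len(dcs):
        raise ValueError("replication factor exceeds the number of DCs")
    assignment = []
    for p in range(num_partitions):
        replicas = []
        for r in range(replication_factor):
            # Rotate the starting DC per partition so that the first replica
            # (the preferred leader) is spread across DCs.
            dc = dcs[(p + r) % len(dcs)]
            brokers = brokers_by_dc[dc]
            # Move on to the next broker inside each DC after a full rotation.
            replicas.append(brokers[(p // len(dcs)) % len(brokers)])
        assignment.append(replicas)
    return assignment


def format_assignment(assignment):
    """Join replicas with ':' and partitions with ',' (assumed tool format)."""
    return ",".join(":".join(str(b) for b in replicas) for replicas in assignment)


# Hypothetical layout: three DCs with two brokers each.
brokers_by_dc = {"dc1": [1, 2], "dc2": [3, 4], "dc3": [5, 6]}
plan = dc_aware_assignment(brokers_by_dc, num_partitions=4, replication_factor=3)
print(format_assignment(plan))  # 1:3:5,3:5:1,5:1:3,2:4:6
```

Note that this only fixes replica placement. It does nothing about the
cross-DC latency and under-replication concerns above, which is why a mirror
cluster is still the better fit.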

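Also, for your other question on the producer: a producer can only deliver
messages to the leader of a partition, not to any broker in the replica set.
The other replicas are followers that copy data from the leader.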
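
As a toy illustration of the leader-only rule, the Python sketch below models
a cluster that rejects writes sent to a non-leader replica, and a producer
that refreshes its metadata and retries once after a leader change. None of
this is the Kafka client API: ToyCluster, ToyProducer, NotLeaderError and the
broker ids are hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


class NotLeaderError(Exception):
    """Raised when a write is sent to a broker that is not the partition leader."""


@dataclass
class PartitionState:
    leader: int      # broker id currently leading the partition
    replicas: list   # all broker ids holding a copy, leader included


class ToyCluster:
    def __init__(self, partitions):
        self.partitions = partitions   # (topic, partition) -> PartitionState
        self.log = defaultdict(list)   # (topic, partition) -> accepted messages

    def metadata(self):
        # What a producer needs to know: the current leader of each partition.
        return {tp: state.leader for tp, state in self.partitions.items()}

    def produce(self, broker_id, tp, message):
        # Only the leader accepts writes; followers copy from the leader.
        if self.partitions[tp].leader != broker_id:
            raise NotLeaderError(f"broker {broker_id} is not the leader for {tp}")
        self.log[tp].append(message)

    def fail_leader(self, tp):
        # Simulate a leader failure by promoting the next replica in the list.
        state = self.partitions[tp]
        state.leader = next(b for b in state.replicas if b != state.leader)


class ToyProducer:
    def __init__(self, cluster):
        self.cluster = cluster
        self.leaders = cluster.metadata()   # cached view, may go stale

    def send(self, tp, message):
        try:
            self.cluster.produce(self.leaders[tp], tp, message)
        except NotLeaderError:
            # The cached leader is stale: refresh metadata and retry once.
            self.leaders = self.cluster.metadata()
            self.cluster.produce(self.leaders[tp], tp, message)


tp = ("events", 0)
cluster = ToyCluster({tp: PartitionState(leader=1, replicas=[1, 3, 5])})
producer = ToyProducer(cluster)
producer.send(tp, "a")      # goes to broker 1, the leader
cluster.fail_leader(tp)     # broker 3 takes over
producer.send(tp, "b")      # first attempt is rejected, retry reaches broker 3
```

Sending directly to a follower, for example cluster.produce(5, tp, "x"),
raises NotLeaderError in this model.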

Thanks,

Joel

On Sun, Jun 23, 2013 at 6:10 PM, Mark Farnan <[EMAIL PROTECTED]> wrote:

> Howdy,
>
> Is the replication factor system in Kafka 0.8 suitable for creating
> single clusters that span data centers (up to 3)?
>
> I am looking for a system where I don't lose messages and can
> effectively 'fail over' to a different datacenter for processing if/when
> the primary goes down. If I read correctly, any message delivered to a
> Kafka broker will be copied to its replica(s), and a producer could
> deliver messages to any broker in the same replica set.
>
> Is that correct?
>
>
> I am aware there are several ZooKeeper issues around multi-DC support
> which I need to sort out, so this question is specific to the Kafka
> portion.
>
> Note: My main consumer from Kafka will be Storm.
>
> Regards,
>
> Mark.