Kafka >> mail # user >> Transferring events across data centers


Re: Transferring events across data centers
It's probably fine to have a remote producer too. You will need to do the
same socket buffer tuning on the producer side to amortize the long network
delay.
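
As a rough sketch of why the buffer size matters: to keep a high-latency link busy, the socket buffer should hold at least the bandwidth-delay product (throughput times round-trip time). The numbers below use the ~80 ms round trip mentioned in this thread and assume the peak volume is meant as ~10 MB per second; the function and constant names are only illustrative. I believe the relevant 0.8 producer setting is `send.buffer.bytes`, but please check the configuration docs for your version.

```python
def bandwidth_delay_product(throughput_bytes_per_sec: float, rtt_seconds: float) -> int:
    """Bytes that must be in flight to keep a link with this round-trip time busy."""
    return int(round(throughput_bytes_per_sec * rtt_seconds))


# ~80 ms round trip between DC1 and DC2 (from the thread)
RTT_SECONDS = 0.080
# Assumption: the peak data rate is ~10 MB per second
PEAK_BYTES_PER_SEC = 10 * 1024 * 1024

# Lower bound for the socket buffer on the sending side
min_buffer = bandwidth_delay_product(PEAK_BYTES_PER_SEC, RTT_SECONDS)
print(f"socket buffer should be at least {min_buffer} bytes (~{min_buffer / 1024:.0f} KB)")
```

Under these assumptions the result is several hundred KB, so a buffer of around 100 KB would likely cap throughput over the WAN link. Treat the figure as a starting point for tuning rather than an exact value.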

Thanks,

Jun

On Sun, Feb 3, 2013 at 10:01 PM, Apoorva Gaurav <[EMAIL PROTECTED]> wrote:

> Thanks Jun.
>
> So we'll have to maintain ZooKeepers and brokers in both DCs, while
> producers can be in DC1 and consumers in the target DC2.
>
> Are there any issues if we keep only the producer in DC1, talking
> to the ZooKeepers and brokers in DC2? I've been able to achieve this by
> adding a "hostname" entry to the broker properties that resolves to the
> internal IP within DC2 and to the public IP from DC1.
>
> On Mon, Feb 4, 2013 at 10:55 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
> > Apoorva,
> >
> > Kafka replication in 0.8 is designed for a Kafka cluster within the same
> > DC. The following wiki describes cross DC mirroring using the tool
> > MirrorMaker and how to optimize the throughput for long network latency.
> >
> > https://cwiki.apache.org/KAFKA/kafka-mirroring-mirrormaker.html
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Feb 1, 2013 at 6:47 PM, Apoorva Gaurav <[EMAIL PROTECTED]> wrote:
> >
> > > Hello All,
> > >
> > > We are working on a Kafka-based event collection system. It needs to
> > > gather events from across data centers. Let's say all the events are
> > > produced in DC1, while the Kafka brokers and consumers sit in DC2. The
> > > round trip between DC1 and DC2 is around ~80 ms. The number of events
> > > should be around ~50 million a day, peaking at ~5K events a second; the
> > > data volume is ~100GB a day, peaking at ~10MB a second. What is the best
> > > way to do it?
> > >
> > > --- Is keeping the producer in DC1 and sending events to DC2 a good idea?
> > > --- Should my ZK quorum lie only in DC2, or should it span both DC1 and
> > > DC2?
> > > --- Will this problem be solved easily in version 0.8.0 through broker
> > > replication, by keeping brokers in both DC1 and DC2?
> > >
> > > --
> > > Thanks & Regards,
> > > Apoorva
> > >
> >
>
>
>
> --
> Thanks & Regards,
> Apoorva
>

 