Kafka, mail # user - Combating network latency best practice


Re: Combating network latency best practice
Jay Kreps 2013-07-10, 16:05
To publish to a remote data center just configure the producers with the
host/port of the remote datacenter. To ensure good throughput you may want
to tune the socket send and receive buffers on the client and server to
avoid small roundtrips:
http://en.wikipedia.org/wiki/Bandwidth-delay_product
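As a rough sketch of the sizing arithmetic: the bandwidth-delay product is the link bandwidth multiplied by the round-trip time, and it gives the number of bytes that can be in flight before the first acknowledgement returns. The link figures below (1 Gbit/s, 100 ms) are made-up example values, and `bdp_bytes` is a hypothetical helper, not part of Kafka.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Return the bandwidth-delay product in bytes, rounded up.

    bandwidth_bps: link bandwidth in bits per second
    rtt_ms:        round-trip time in milliseconds
    """
    # bits in flight = bandwidth_bps * (rtt_ms / 1000); divide by 8 for bytes.
    # Integer arithmetic with +7999 rounds up instead of truncating.
    return (bandwidth_bps * rtt_ms + 7999) // 8000


# Assumed example link: 1 Gbit/s between data centers, 100 ms round trip.
buffer_size = bdp_bytes(1_000_000_000, 100)
print(buffer_size)  # 12500000 bytes, i.e. about 12.5 MB
```

The result is a starting point for the socket buffer settings on both ends (in recent Kafka versions these are `send.buffer.bytes` on the producer and `socket.receive.buffer.bytes` on the broker, but check the configuration reference for your version). The operating system may also cap socket buffer sizes, so its limits may need raising as well.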

-Jay

On Wed, Jul 10, 2013 at 6:57 AM, Calvin Lei <[EMAIL PROTECTED]> wrote:

> Thanks Jay. I thought of using the worldview architecture you suggested.
> But our consumers are also globally deployed, which means any new messages
> that arrive at the worldview cluster need to be replicated back to the local
> DCs, making the topology a bit complicated.
>
> Would you please elaborate on the remote write? How do I achieve it?
> On Jul 10, 2013 1:08 AM, "Jay Kreps" <[EMAIL PROTECTED]> wrote:
>
> > Ah, good question; we really should add this to the documentation.
> >
> > We run a cluster per data center. All writes always go to the data-center
> > local cluster. Replication to aggregate clusters that provide the "world
> > wide" view is done with mirror maker.
> >
> > It is also fine to write to or read from a Kafka cluster in a remote colo,
> > though obviously you have to think about the case where the cluster is not
> > accessible due to network problems.
> >
> > Kafka is not designed to run a single cluster spread across geographically
> > disparate colos, and you would see a few problems in that scenario. The
> > first is that, as you noted, the latency will be terrible as it will block
> > on the slowest response from all datacenters. This could be avoided if you
> > lowered request.required.acks to 1, but that would weaken durability
> > guarantees. The second problem is that Kafka will not remain available in
> > the presence of network partitions, so if the inter-datacenter link failed
> > one datacenter would lose its cluster. Finally, we have not done anything to
> > attempt to optimize partition placement by colo, so you would not actually
> > have redundancy between colos because we would often place all replicas in
> > a single colo.
> >
> > -Jay
> >
> >
> > On Tue, Jul 9, 2013 at 9:34 PM, Calvin Lei <[EMAIL PROTECTED]> wrote:
> >
> > > Folks,
> > >    Our application has multiple producers globally (region1, region2,
> > > region3). If we group all the brokers together into one cluster, we notice
> > > obvious network latency when a broker replicates across regions with
> > > request.required.acks = -1.
> > >
> > >    Is there any best practice for combating the network latency in the
> > > deployment topology? Should we segregate the brokers regionally (one Kafka
> > > cluster per region) and set up MirrorMaker between the regions (region1
> > > <--> region2, region2 <--> region3, region1 <--> region3), a total of 6
> > > mirror makers?
> > >
> > >
> > > Thanks.
> > >
> >
>
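
For the full-mesh layout asked about in the original message, the count of six can be checked by enumerating every ordered (source, target) pair of regions, since each direction needs its own mirror. This is only a counting sketch; `full_mesh_mirrors` is a hypothetical helper and does not configure MirrorMaker.

```python
from itertools import permutations


def full_mesh_mirrors(regions):
    """List every directed (source, target) mirroring link between regions."""
    # permutations(..., 2) yields ordered pairs, so region1 -> region2 and
    # region2 -> region1 are counted separately, one mirror per direction.
    return list(permutations(regions, 2))


mirrors = full_mesh_mirrors(["region1", "region2", "region3"])
for source, target in mirrors:
    print(f"{source} -> {target}")
print(len(mirrors))  # 6
```

In general a full mesh of n regions needs n * (n - 1) directed mirrors, so the number grows quickly as regions are added (12 for four regions, 20 for five).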