Kafka, mail # user - Mirroring datacenters without vpn

Re: Mirroring datacenters without vpn
Joel Koshy 2014-01-11, 01:09
> Ops proposed setting up the mirror to work over an open internet channel
> without a secured VPN. Security of this particular data is not a concern
> and, as I understand it, this will give us more bandwidth (unless we buy
> some extra hardware; lots of internal details there).
> Is this configuration possible at all? Has anyone tried, or is anyone
> using, such a configuration? I'd appreciate any feedback.
> Major source of confusion is how MirrorMaker/other producers would handle
> external names for the brokers. As I understand, producer connects to the
> broker in the configuration only to bootstrap (get list of all available
> brokers), and after that talks to the brokers received during
> bootstrapping. So local clients won't work (or will route to the external
> interface) if I configure brokers to use external names. Remote clients
> won't work if internal names are configured.
> Is there some reasonable way to configure Kafka to support such a scenario?

Would this feature help in your case:
i.e., you can configure the broker to publish a separate hostname to
ZooKeeper, which is what the producers will use when actually sending
data. So you would need to override advertised.host.name and
advertised.port on the brokers.
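
As a rough illustration of that override, here is a minimal sketch that only builds and prints the two broker properties. The host name and port are placeholders, the class and method names are made up for this example, and I am assuming the property keys are advertised.host.name and advertised.port, so please check them against the broker configuration docs for your version.

```java
import java.util.Properties;

public class AdvertisedBrokerConfig {

    // Builds the broker overrides that would go into server.properties.
    // Assumption: the keys are advertised.host.name and advertised.port.
    static Properties advertisedOverrides(String externalHost, int externalPort) {
        Properties p = new Properties();
        p.setProperty("advertised.host.name", externalHost);
        p.setProperty("advertised.port", Integer.toString(externalPort));
        return p;
    }

    public static void main(String[] args) {
        // kafka1.example.com:9092 is a placeholder for the externally
        // reachable address of one broker.
        Properties p = advertisedOverrides("kafka1.example.com", 9092);
        for (String key : p.stringPropertyNames()) {
            System.out.println(key + "=" + p.getProperty(key));
        }
    }
}
```

The printed key=value lines are what you would add to each broker's server.properties, with that broker's own external address.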

> Also, should I run MirrorMaker in the same DC as central kafka cluster or
> multiple MirrorMakers in remote DCs?
> Any description of how it is set up in your case would be helpful. Do you use a VPN
> between DCs? Where do you run MirrorMaker - in central dc or in remote and
> why?

We generally run the mirror-maker in the target data center, i.e., we
do a remote consume but a local produce. If you have a flaky connection
between the two clusters, the consumers may hit session expirations
and rebalance, which reduces the overall throughput. You can also do
local consumption and remote produce, although we have not tried
that. In either case you will need to set a large socket buffer size
to help amortize the high network latencies.
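
To make the socket buffer point concrete, here is a minimal sketch of the consumer and producer settings a mirror-maker in the target data center might use. It only builds java.util.Properties objects and does not talk to Kafka. The property keys (zookeeper.connect, group.id, socket.receive.buffer.bytes, metadata.broker.list, send.buffer.bytes) are the 0.8-era names as I recall them, and the host names, group id and 2 MB buffer size are placeholders rather than recommendations, so please verify all of them for your version and your link.

```java
import java.util.Properties;

public class MirrorMakerConfigSketch {

    // Consumer side: reads from the remote (source) cluster over the WAN.
    static Properties consumerProps(String remoteZk, String groupId, int recvBufBytes) {
        Properties p = new Properties();
        p.setProperty("zookeeper.connect", remoteZk);
        p.setProperty("group.id", groupId);
        // Larger receive buffer to keep more data in flight on a
        // high-latency connection.
        p.setProperty("socket.receive.buffer.bytes", Integer.toString(recvBufBytes));
        return p;
    }

    // Producer side: writes to the local (target) cluster.
    static Properties producerProps(String localBrokers, int sendBufBytes) {
        Properties p = new Properties();
        p.setProperty("metadata.broker.list", localBrokers);
        p.setProperty("send.buffer.bytes", Integer.toString(sendBufBytes));
        return p;
    }

    public static void main(String[] args) {
        int twoMb = 2 * 1024 * 1024; // illustrative value only
        Properties c = consumerProps("zk.remote-dc.example.com:2181", "mirror-group", twoMb);
        Properties pr = producerProps("kafka1.local-dc.example.com:9092", twoMb);
        System.out.println("# consumer.properties");
        for (String k : c.stringPropertyNames()) {
            System.out.println(k + "=" + c.getProperty(k));
        }
        System.out.println("# producer.properties");
        for (String k : pr.stringPropertyNames()) {
            System.out.println(k + "=" + pr.getProperty(k));
        }
    }
}
```

If you go the other way (local consume, remote produce), the large buffer would matter on the producer side instead.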