Hi there... we're currently looking into using Kafka as a pipeline for passing around log messages. We like its use of Zookeeper for coordination (as we already make heavy use of Zookeeper at Nextdoor), but I'm running into one big problem. Everything we do is a) in the cloud, b) secure, and c) cross-region/datacenter/cloud-provider.
We use SSL for both encryption and authentication of most of our services. My understanding is that Kafka 0.7.x producers and consumers connect to Zookeeper to retrieve a list of the current Kafka servers, and then make direct TCP connections to the individual servers that they need in order to publish or subscribe to a stream. In 0.8.x that's changed, so clients can now connect to a single Kafka server and get the list of servers via an API. Is that right?
What I'm wondering is whether we can actually put an ELB in front of *all* of our Kafka servers, throw stunnel on them, and give our producers and consumers a single endpoint to connect to (through the ELB) rather than having them connect directly to the individual Kafka servers. This would give us both encryption of the data in transit and authentication of the producers and consumers. Lastly, if it works, it would provide these features without impacting our ability to use the existing Kafka producers/consumers that people have written.
My concern is that the Kafka clients (producers or consumers?) would connect once through the ELB, then get the list of servers via the API, and finally try to connect directly to one of those Kafka servers rather than just leveraging the existing connection through the ELB.
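To make the concern concrete, here is a small Python sketch of the behaviour I'm worried about. It is not real Kafka client code: the `Broker` class, the `connections_after_bootstrap` function, the ELB hostname, and the broker addresses are all made up for illustration. It only assumes that the metadata response maps each partition to the host and port of its leader, and that the client then connects to those addresses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Broker:
    """Hypothetical stand-in for one broker entry in a metadata response."""
    node_id: int
    host: str
    port: int


def connections_after_bootstrap(partition_leaders):
    """Return the (host, port) endpoints a client would open after bootstrap.

    partition_leaders maps partition id -> leader Broker, as assumed to be
    returned by the metadata call made over the initial connection.
    """
    return {(leader.host, leader.port) for leader in partition_leaders.values()}


# Hypothetical single endpoint handed to every producer and consumer.
ELB = ("kafka-elb.example.com", 9092)

# Hypothetical metadata response: brokers report their own addresses.
leaders = {
    0: Broker(1, "10.0.1.11", 9092),
    1: Broker(2, "10.0.1.12", 9092),
    2: Broker(1, "10.0.1.11", 9092),
}

targets = connections_after_bootstrap(leaders)

# If the ELB address is not among the targets, all produce/fetch traffic
# would go straight to the brokers and skip the ELB (and stunnel) entirely.
bypasses_elb = ELB not in targets
```

If clients really do behave this way, then in this sketch `bypasses_elb` ends up true, which is exactly the situation I'd like to avoid.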