How do people handle the broker.id property in situations where the Kafka
(broker) cluster is not fully defined up front?

Here's the use case we have at Sematext:
* Our software ships as a VM
* All components run in this single VM, including 1 Kafka broker
* Of course, this is just for a nice OOTB experience, but to scale one
needs to have more instances of this VM, including more Kafka brokers
* One can clone our VM and launch N instances of it, but because we have a
single Kafka broker config with a single broker.id in it, one can't just
launch more of these VMs and expect to see more Kafka brokers join the
cluster. One would have to change the broker.id on each new VM instance.

How do others handle this in software that is packaged and shipped to users
and is not under your direct control, so you cannot edit the configs yourself?

Would it be best to have a script that connects to ZooKeeper to get the list
of all existing brokers and their IDs and then generates a new distinct ID +
config for the new Kafka broker?
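
To make the idea concrete, here is a minimal sketch of what such a script
might look like. It assumes the third-party kazoo client library is
installed and relies on Kafka registering live brokers under /brokers/ids in
ZooKeeper. The function names (next_broker_id, set_broker_id,
fetch_registered_ids) are made up for illustration. Note the limits of this
approach: only brokers that are currently running show up in ZooKeeper, and
two VMs booting at the same moment could pick the same ID, so a real script
would need some form of locking or retry on top of this.

```python
import re


def next_broker_id(existing_ids):
    """Return the smallest non-negative integer not already used as a broker ID."""
    used = {int(i) for i in existing_ids}
    candidate = 0
    while candidate in used:
        candidate += 1
    return candidate


def set_broker_id(config_text, broker_id):
    """Replace the broker.id line in server.properties text, or append one."""
    line = "broker.id=%d" % broker_id
    pattern = re.compile(r"^\s*broker\.id\s*=.*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(line, config_text, count=1)
    # No broker.id line present: append one, adding a newline first if needed.
    sep = "" if config_text.endswith("\n") or not config_text else "\n"
    return config_text + sep + line + "\n"


def fetch_registered_ids(zk_hosts):
    """List the broker IDs currently registered under /brokers/ids."""
    # Assumption: kazoo is available; imported lazily so the pure helpers
    # above can be used and tested without it.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts=zk_hosts)
    zk.start()
    try:
        return zk.get_children("/brokers/ids")
    finally:
        zk.stop()
```

At VM boot, before starting Kafka, one would call
fetch_registered_ids("zkhost:2181") (the host name is a placeholder), pass
the result to next_broker_id, and write the output of set_broker_id back to
the broker's server.properties.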

Or are there slicker ways to do this that people use?

Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

  Florian Dambrine 2014-11-03, 22:10
  Joel Koshy 2014-11-03, 22:12
  Joe Stein 2014-11-03, 22:18
  Gwen Shapira 2014-11-03, 22:25
  Jay Kreps 2014-11-03, 23:58
  Neha Narkhede 2014-11-04, 21:47