Hello, I am a complete newbie to Kafka and am trying to evaluate its usefulness for our particular application. I plan to have a lot of consumers in a single group, and it seems like the best way to load-balance messages across consumers, without knowing ahead of time exactly how many consumers there will be, is simply to create a large number of partitions so that each consumer takes a significant share of them.

This line of thought has led me to the following question: what is the potential overhead of creating a large number of partitions on a topic?

We would probably have at least 300 consumers to start and want room to grow. I'd probably want roughly 5x as many partitions as consumers, to absorb consumers dying or being brought online. Should I be concerned about this in terms of memory overhead or performance?
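For concreteness, here is the rough sizing arithmetic I have in mind (the numbers are just my starting assumptions, not a Kafka recommendation):

```python
# Rough partition-count sizing sketch.
# All figures are assumptions from this post, not Kafka guidance.
consumers = 300         # initial number of consumers in the single group
headroom_factor = 5     # ~5x partitions per consumer, to absorb consumer churn

partitions = consumers * headroom_factor
print(partitions)       # 1500 partitions on the topic
```

So the question is really about whether a topic on the order of 1500+ partitions causes problems.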

Ian Friedman