Kafka, mail # user - Large number of Partitions - 2013-03-19, 22:17
Tom Amon 2013-12-31, 21:44
Large number of Partitions
Hello, I am a complete newbie to Kafka and am trying to evaluate its usefulness for our particular application. I plan to have a lot of consumers in a single group, and it seems the best way to load-balance messages across consumers, without knowing ahead of time exactly how many consumers you will have, is simply to create a large number of partitions so that each consumer takes a significant share of them.

This line of thought has led me to the following question: What is the potential overhead from creating a large number of partitions on a topic?

We would probably have at least 300 consumers to start and want room to grow. I'd probably want about 5x as many partitions as consumers, to absorb consumers dying or being brought online. Should I be concerned about this in terms of memory overhead or performance?
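The arithmetic behind the 5x sizing can be sketched with a small, hypothetical simulation of range-style partition assignment (simplified and single-topic; similar in spirit to Kafka's default range assignor, not its actual implementation):

```python
# Hypothetical sketch: range-style assignment of partitions to a sorted
# list of consumers. Each consumer gets a contiguous block; the first
# (num_partitions % num_consumers) consumers get one extra partition.
def assign_partitions(num_partitions, consumers):
    consumers = sorted(consumers)
    base, extra = divmod(num_partitions, len(consumers))
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        count = base + (1 if i < extra else 0)
        assignment[consumer] = list(range(start, start + count))
        start += count
    return assignment

# 1500 partitions (5x) across 300 consumers -> exactly 5 each.
full = assign_partitions(1500, [f"c{i:03d}" for i in range(300)])
print(all(len(parts) == 5 for parts in full.values()))  # True

# If one consumer dies, the same 1500 partitions rebalance across 299
# consumers: some get 6, the rest keep 5 -- no partition is orphaned.
degraded = assign_partitions(1500, [f"c{i:03d}" for i in range(299)])
print(sorted({len(parts) for parts in degraded.values()}))  # [5, 6]
```

The point of the over-provisioning is visible in the second call: losing a consumer only shifts a few partitions per survivor, rather than leaving work unassigned.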

Ian Friedman
Jun Rao 2013-03-20, 04:34