Large number of Partitions
Hello, I am a complete newbie to Kafka and am trying to evaluate its usefulness for our particular application. I plan to have a lot of consumers in a single group, and it seems like the best way to load-balance messages across consumers, without knowing ahead of time exactly how many consumers there will be, is simply to create a large number of partitions so that each consumer takes a significant portion of them (a rough sketch of what I mean follows).
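
For concreteness, the consumer side I have in mind is roughly the following. Topic and group names are placeholders, and I'm assuming the Java client's group-managed consumer; the point is just that every instance shares a group.id and subscribes, rather than being assigned partitions by hand:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class GroupConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            // Every instance uses the same group.id, so Kafka spreads the
            // topic's partitions across however many instances are alive
            // and rebalances when one joins or dies.
            props.put("group.id", "worker-group");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("events"));  // subscribe, don't assign
                while (true) {
                    ConsumerRecords<String, String> records =
                            consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> r : records) {
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                r.partition(), r.offset(), r.value());
                    }
                }
            }
        }
    }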

This line of thought has led me to the following question: what is the potential overhead of creating a large number of partitions on a topic?

We would probably have at least 300 consumers to start and want room to grow, and I'd probably want roughly 5x as many partitions as consumers (so about 1,500 partitions) to absorb consumers dying or being brought online. Should I be concerned about this in terms of memory overhead or performance?
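
In code, that sizing would come out to something like the following sketch. The topic name and replication factor are placeholders, and I'm assuming the Java AdminClient is available:

    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateLargeTopic {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // ~5x partitions per expected consumer:
                // 300 consumers * 5 = 1500 partitions.
                NewTopic topic = new NewTopic("events", 1500, (short) 3);
                admin.createTopics(List.of(topic)).all().get();
            }
        }
    }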

Thanks,
Ian Friedman