Re: Kafka cluster with lots of topics
ZooKeeper will not be the only problem. First, each topic partition is a
directory on the file system, and each of those directories holds log files.
With this many topics, that is going to be fairly overwhelming for the file
system. Also, I cannot speak for the internals, but there may be cases where
this many topics causes a big array to be allocated, or some other
non-optimal behaviour.
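
To put a rough number on the file system load, here is a back-of-the-envelope sketch. It assumes one partition per topic, one log segment per partition, and two files per segment (a .log file and its .index file). The function name and the defaults are illustrative assumptions, not measurements from a real cluster; replication and multiple segments would push the numbers higher.

```python
def estimate_fs_entries(topics, partitions_per_topic=1,
                        segments_per_partition=1, files_per_segment=2):
    """Return (directories, files) a broker fleet would hold in total.

    All defaults are assumptions for illustration: one partition per topic,
    one segment per partition, and a .log plus an .index file per segment.
    """
    # One directory per topic partition.
    directories = topics * partitions_per_topic
    # Each directory holds the files of all of its segments.
    files = directories * segments_per_partition * files_per_segment
    return directories, files


if __name__ == "__main__":
    dirs, files = estimate_fs_entries(10_000_000)
    print(f"{dirs:,} directories and {files:,} files")
```

Even under these minimal assumptions, ten million topics means ten million directories before a single extra segment is rolled.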

As with an RDBMS that has this many tables, one might ask: why? Isn't there a
way to design the system to be multi-tenant, so that so many physical topics
are not needed?
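
One possible shape for such a multi-tenant design, as a minimal sketch only: hash each logical topic name onto a small, fixed set of physical topics and carry the logical name in the message key so consumers can filter on it. The pool size of 64, the "shared-" prefix, and the function name are all hypothetical choices for illustration; nothing here calls a Kafka client API.

```python
import zlib

NUM_PHYSICAL_TOPICS = 64  # assumed pool size, chosen only for illustration


def route(logical_topic, num_physical=NUM_PHYSICAL_TOPICS, prefix="shared-"):
    """Map a logical topic to (physical_topic_name, message_key).

    The mapping is deterministic, so producers and consumers agree on it
    without any coordination. The logical name travels as the message key.
    """
    key = logical_topic.encode("utf-8")
    # crc32 is stable across processes, unlike Python's built-in hash().
    bucket = zlib.crc32(key) % num_physical
    return f"{prefix}{bucket}", key


if __name__ == "__main__":
    physical, key = route("tenant-12345.events")
    print(physical, key)
```

A producer would send to the returned physical topic with the returned key; a consumer interested in one tenant would read that physical topic and discard messages whose key does not match. The trade-off is that consumers read some traffic they do not need.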
On Wed, Nov 13, 2013 at 9:41 AM, Neha Narkhede <[EMAIL PROTECTED]> wrote:

> With that many topics, ZooKeeper will be the main bottleneck. The leader
> election process will take a very long time, increasing the unavailability
> window of the cluster.
>
> Thanks,
> Neha
> On Nov 13, 2013 4:49 AM, "Joe Freeman" <[EMAIL PROTECTED]> wrote:
>
> > Would I be correct in assuming that a Kafka cluster won't scale well to
> > support lots (tens of millions) of topics? If I understand correctly, a
> > node being added or removed would involve a leader election for each topic,
> > which is a relatively expensive operation?
> >
>

 