Kafka >> mail # user >> Kafka cluster with lots of topics


Joe Freeman 2013-11-13, 12:49
Neha Narkhede 2013-11-13, 14:42
hsy541@...> 2013-11-13, 19:50
Neha Narkhede 2013-11-13, 22:55
Edward Capriolo 2013-11-14, 04:38
Re: Kafka cluster with lots of topics
Thanks for the replies. I don't think Kafka quite fits our use case,
unfortunately. To answer Edward's question in the abstract: in a system
with lots of users, we were considering having a topic per user, so that an
individual user can connect from a number of endpoints and receive events,
including events that were sent while the user was disconnected. Persisting
the events to disk and using offsets means we don't have to track which
events each individual endpoint has received.
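
The offset idea above can be sketched without Kafka at all. What follows is
a minimal in-memory model in Python, not Kafka client code: `EventLog`,
`Endpoint`, and the user and event names are all hypothetical, and a plain
list stands in for the log that would be persisted to disk.

```python
from collections import defaultdict


class EventLog:
    """In-memory stand-in for one persisted, append-only log per user."""

    def __init__(self):
        self._events = defaultdict(list)  # user_id -> ordered list of events

    def append(self, user_id, event):
        """Store an event and return the offset it was written at."""
        log = self._events[user_id]
        log.append(event)
        return len(log) - 1

    def read_from(self, user_id, offset):
        """Return every event at or after `offset` for this user."""
        return self._events[user_id][offset:]


class Endpoint:
    """A client device that remembers only its own next offset."""

    def __init__(self, log, user_id):
        self.log = log
        self.user_id = user_id
        self.next_offset = 0

    def poll(self):
        """Fetch unseen events and advance the local offset."""
        events = self.log.read_from(self.user_id, self.next_offset)
        self.next_offset += len(events)
        return events


log = EventLog()
phone = Endpoint(log, "alice")
laptop = Endpoint(log, "alice")

log.append("alice", "e0")
first_phone = phone.poll()       # phone is connected and sees e0
log.append("alice", "e1")        # sent while the laptop is still offline
log.append("alice", "e2")
laptop_catch_up = laptop.poll()  # laptop replays everything from offset 0
second_phone = phone.poll()      # phone only gets what it has not seen
```

The log never records which endpoint has seen what; each endpoint carries
only a single integer, which is the property we were hoping to get from a
topic per user.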

On 14 November 2013 04:38, Edward Capriolo <[EMAIL PROTECTED]> wrote:

> ZooKeeper will not be the only problem. The first problem is that each
> topic is a directory on the file system. Each of those is going to have
> files inside it. This is going to be fairly overwhelming for the file
> system. Also, I cannot speak for the internals, but there may be cases
> where this many topics allocates a big array or causes some other
> non-optimal behaviour.
>
> As with an RDBMS with this many tables, one might ask: why? Isn't there a
> way to design the system as multi-tenant, so that so many physical topics
> are not needed?
>
>
> On Wed, Nov 13, 2013 at 9:41 AM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>
> > With that many topics, ZooKeeper will be the main bottleneck. The leader
> > election process will take very long, increasing the unavailability
> > window of the cluster.
> >
> > Thanks,
> > Neha
> > On Nov 13, 2013 4:49 AM, "Joe Freeman" <[EMAIL PROTECTED]> wrote:
> >
> > > Would I be correct in assuming that a Kafka cluster won't scale well to
> > > support lots (tens of millions) of topics? If I understand correctly, a
> > > node being added or removed would involve a leader election for each
> > > topic, which is a relatively expensive operation?
> > >
> >
>
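
For reference, the multi-tenant design Edward asks about could look roughly
like the sketch below: users are hashed onto a small, fixed set of shared
topics, and each message carries its user ID so a reader can filter. This is
only an illustration in plain Python with no Kafka client involved. The
topic count of 64, the `user-events-NNN` naming, and all function names are
assumptions made up for the example.

```python
import hashlib

NUM_SHARED_TOPICS = 64  # assumed fixed topic count, chosen for illustration


def topic_for_user(user_id, num_topics=NUM_SHARED_TOPICS):
    """Map a user to one of a fixed set of shared topics via a stable hash."""
    # hashlib is used because Python's built-in hash() of a str is salted
    # per process, so it would not give the same bucket across restarts.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % num_topics
    return f"user-events-{bucket:03d}"


def make_message(user_id, payload):
    """Tag each message with its user so readers can filter a shared topic."""
    return {"user_id": user_id, "payload": payload}


def events_for_user(messages, user_id):
    """Pick one user's events out of a shared topic's message list."""
    return [m["payload"] for m in messages if m["user_id"] == user_id]


# Dict of lists standing in for the shared topics.
shared_topics = {}
for user, payload in [("alice", "a1"), ("bob", "b1"), ("alice", "a2")]:
    topic = topic_for_user(user)
    shared_topics.setdefault(topic, []).append(make_message(user, payload))

alice_events = events_for_user(shared_topics[topic_for_user("alice")], "alice")
```

The trade-off in this sketch is that a reader interested in one user also
reads, and discards, events for every other user hashed to the same topic,
and an offset into a shared topic no longer corresponds to a position in a
single user's stream.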

--
Bitroot - http://bitroot.com

 
Robert Rodgers 2013-11-18, 19:57