Re: Kafka cluster with lots of topics
Thanks for the replies. I don't think Kafka quite fits our use case,
unfortunately. To abstractly answer Edward's question: in a system with
lots of users, we were considering having a topic per user (such that an
individual user can connect from a number of endpoints and receive events,
including events that were sent while the user was disconnected -
persisting the events to disk and using offsets means we don't have to
track which events each individual endpoint has received).
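As a rough illustration of the offset idea described above, here is a minimal in-memory sketch. It is not Kafka's API: `UserLog`, `append`, and `read_from` are hypothetical names, and a real system would persist the log to disk. The point is only that each endpoint remembers a single number (its next offset), so the log itself keeps no per-endpoint state.

```python
class UserLog:
    """Hypothetical append-only event log for a single user (not a Kafka API)."""

    def __init__(self):
        self._events = []

    def append(self, event):
        """Store an event and return the offset it was written at."""
        self._events.append(event)
        return len(self._events) - 1

    def read_from(self, offset):
        """Return all events at or after `offset`, plus the next offset to ask for."""
        events = self._events[offset:]
        return events, offset + len(events)


log = UserLog()
log.append("msg-1")
log.append("msg-2")

# Each endpoint tracks only its own offset; the log knows nothing about endpoints.
phone_offset = 0
laptop_offset = 0

# The phone is connected and reads everything written so far.
phone_events, phone_offset = log.read_from(phone_offset)

# An event arrives while the laptop is still disconnected.
log.append("msg-3")

# The laptop connects later and replays from its own offset;
# the phone only needs the events it has not yet seen.
laptop_events, laptop_offset = log.read_from(laptop_offset)
phone_catchup, phone_offset = log.read_from(phone_offset)
```

With one such log per user, the number of logs grows with the number of users, which is the scaling concern raised in the replies quoted below.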

On 14 November 2013 04:38, Edward Capriolo <[EMAIL PROTECTED]> wrote:

> ZooKeeper will not be the only problem. Another is that each topic is a
> directory on the file system (one per partition), and each of those is
> going to have files inside it. This is going to be fairly overwhelming for
> the file system. Also, I cannot speak for the internals, but there may be
> cases where this many topics allocates a big array or triggers some other
> non-optimal behaviour.
> As with an RDBMS with this many tables, one might ask: why? Isn't there a
> way to design the system to be multi-tenant, so that so many physical
> topics are not needed?
> On Wed, Nov 13, 2013 at 9:41 AM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
> > With that many topics, ZooKeeper will be the main bottleneck. The leader
> > election process will take very long, increasing the unavailability window
> > of the cluster.
> >
> > Thanks,
> > Neha
> > On Nov 13, 2013 4:49 AM, "Joe Freeman" <[EMAIL PROTECTED]> wrote:
> >
> > > Would I be correct in assuming that a Kafka cluster won't scale well to
> > > support lots (tens of millions) of topics? If I understand correctly, a
> > > node being added or removed would involve a leader election for each
> > > topic, which is a relatively expensive operation?
> > >
> >
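
One way to read the multi-tenant suggestion quoted above is to map many users onto a small, fixed number of shared partitions and tag each event with its user ID. The following is only a sketch under that assumption: `NUM_PARTITIONS`, `partition_for`, and `make_record` are hypothetical names chosen for illustration, and nothing here calls a Kafka client.

```python
import hashlib

NUM_PARTITIONS = 64  # assumed value, chosen only for illustration


def partition_for(user_id, num_partitions=NUM_PARTITIONS):
    """Map a user ID to a partition with a hash that is stable across processes."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def make_record(user_id, payload):
    """Build a record keyed by user ID so readers can filter by user."""
    return {
        "partition": partition_for(user_id),
        "key": user_id,
        "value": payload,
    }


record = make_record("alice", "msg-1")
```

The trade-off is that a reader interested in one user has to skip the records of other users that share the same partition, so per-endpoint offsets refer to a position in the shared partition and not in a per-user stream.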

Bitroot - http://bitroot.com