Lorenzo Alberton 2012-07-30, 15:39
Taylor Gautier 2012-07-31, 19:58
Re: Thousands of topics
Hi Taylor,

thanks for your reply. I'd love to read your blog post about your
experiences with it, especially around hardware configuration and how you
consume the data (few/many short/long-lived processes, average throughput
per topic). The cleanup script seems really useful too; I was considering
writing one that also cleans dead topics off ZooKeeper.

Thanks!

Lorenzo
On Tue, Jul 31, 2012 at 8:58 PM, Taylor Gautier <[EMAIL PROTECTED]> wrote:

> Yes, we have done so at Tagged.  I chronicled a bit of our experience here
> on the mailing list.  Effectively we found that a single machine could
> not go above ~20k total topics.  This could be OS-dependent, however (we use
> CentOS 5.x).
>
> Various tweaks we made to go further:
>
>    1. a beefed-up node.js kafka client/producer implementation -
>    https://github.com/tagged/node-kafka lies at the heart of our kafka
>    deployment
>    2. our own kafka software load balancer (implemented using said library)
>    that shards out independent Kafka instances (guarantees in-order
>    delivery per topic and scales the # of kafka topics linearly as a
>    function of the # of kafka machines)
>    3. a continuous cleaner that removes old dead topics completely from the
>    filesystem (the 0.7 cleaner leaves an empty directory/file which eats up
>    open file handles and limits the max # of topics)
>    4. (coming soon) a hierarchical topic directory structure to ease the
>    pain of too many directories/files in a single directory (should help
>    the ~20k number, though probably by less than you might imagine)
>
> On our todo list is blogging about this in more detail, and contributing
> back more than just the node.js implementation.
>
> On Mon, Jul 30, 2012 at 8:39 AM, Lorenzo Alberton <[EMAIL PROTECTED]> wrote:
>
> > Is there anyone who tried Kafka with thousands of concurrent topics?
> > If so, what are your experiences? How did you tune it?
> >
> > Thanks!
> >
>
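
For readers following the thread, the sharding idea in item 2 of the quoted
list (each topic lives on exactly one independent Kafka instance, so per-topic
ordering is preserved and topic capacity grows with the number of machines)
can be sketched roughly as below. This is only an illustration under stated
assumptions: the thread does not say how Tagged's balancer assigns topics, so
the stable-hash-modulo scheme, the function name broker_for_topic, and the
broker addresses are all hypothetical.

```python
import hashlib
from typing import Sequence


def broker_for_topic(topic: str, brokers: Sequence[str]) -> str:
    """Map a topic name to exactly one broker, deterministically.

    Assumption: a plain hash-modulo assignment. The thread does not describe
    the real balancer's scheme.
    """
    if not brokers:
        raise ValueError("at least one broker is required")
    # Use a stable digest rather than Python's built-in hash(), which is
    # randomized per process for strings, so that every balancer process
    # computes the same mapping.
    digest = hashlib.md5(topic.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


# Hypothetical broker addresses, for illustration only.
example_brokers = ["kafka-01:9092", "kafka-02:9092", "kafka-03:9092"]
print(broker_for_topic("user-42-events", example_brokers))
```

Because every message for a given topic goes to the same instance, ordering
within a topic is left to that single broker. One known trade-off of plain
modulo hashing is that changing the number of brokers remaps most topics; a
consistent-hashing ring or an explicit lookup table would avoid that.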
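
The continuous cleaner from item 3 of the quoted list (removing old dead
topics completely from the filesystem) might look something like the sketch
below. It is not the Tagged script, which is not shown in the thread. It
assumes that each immediate subdirectory of a log root holds one topic's
files, and that a topic is "dead" when all of its files are empty and nothing
has been modified for some idle period. The function names, the log_root
parameter, and the idle threshold are hypothetical.

```python
import os
import shutil
import time


def find_dead_topic_dirs(log_root, max_idle_seconds, now=None):
    """Return topic directories that hold only empty files and are idle."""
    now = time.time() if now is None else now
    dead = []
    for entry in sorted(os.listdir(log_root)):
        path = os.path.join(log_root, entry)
        if not os.path.isdir(path):
            continue
        files = [os.path.join(path, name) for name in os.listdir(path)]
        # Skip the directory if any entry in it has a non-zero size.
        if any(os.path.getsize(f) > 0 for f in files):
            continue
        # Skip it if the directory or anything in it was modified recently.
        newest = max([os.path.getmtime(path)] + [os.path.getmtime(f) for f in files])
        if now - newest >= max_idle_seconds:
            dead.append(path)
    return dead


def remove_dead_topic_dirs(log_root, max_idle_seconds, dry_run=True):
    """Delete dead topic directories; with dry_run=True only report them."""
    dead = find_dead_topic_dirs(log_root, max_idle_seconds)
    if not dry_run:
        for path in dead:
            shutil.rmtree(path)
    return dead
```

The dry-run default is deliberate, since deleting a directory that a running
broker still has open may not be safe, and the thread does not say how that
case is handled. Cleaning the matching entries out of ZooKeeper, as Lorenzo
suggests, would be a separate step and is not attempted here.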
Johan Rydberg 2012-08-14, 05:17