Kafka, mail # user - Re: Using Kafka for "data" messages - 2013-06-13, 19:13
Ah yes, I had read that Kafka prefers fewer than 1,000 topics, but I wasn't sure whether that was really a limitation.  In principle I wouldn't mind having all guest events placed on the "GUEST_DATA" queue, but I thought that by having more topics I could minimize how often consumers read messages only to discard them.

My thought had been that if I have 20 Web JVMs and at any given time 1,000 people are logged in per JVM, each JVM would only need to consume the messages from 1,000 topics.  If instead there is a single topic, every JVM will be consuming from that same topic (each in a different consumer group), but 19 out of 20 messages will be for guests who are not even logged into that JVM.  Since Kafka doesn't have message selectors or anything like that, I was hoping to use topics to help segregate the traffic.

I don't want to use 1 topic per Web JVM because in the future other consumers may be interested in that same data, and the services that put the data in Kafka shouldn't have to look up which JVM that user is logged into (or get that from another message and keep track of it).  Any thoughts on how to work around this?  I know there are topic partitions, but if I understood correctly, those seem more like a way to distribute the workload of storing the messages than a fit for the message selection scenario I am describing.
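
To make the "19 out of 20" figure concrete, here is a rough sketch in plain Python (no Kafka client involved) of what each Web JVM would do with a single shared topic: read everything and filter on the client side. The numbers come from the scenario above; the names `build_sessions` and `consume`, the event layout, and the assumption of one event per logged-in guest are made up purely for illustration.

```python
# Numbers from the scenario above: 20 web JVMs, 1,000 logged-in guests each.
NUM_JVMS = 20
GUESTS_PER_JVM = 1000


def build_sessions(num_jvms, guests_per_jvm):
    """Map each JVM id to the set of guest ids logged into it (hypothetical layout)."""
    return {
        jvm: set(range(jvm * guests_per_jvm, (jvm + 1) * guests_per_jvm))
        for jvm in range(num_jvms)
    }


def consume(events, local_guests):
    """Client-side filter: keep events for local guests, count the rest as discarded."""
    kept = [e for e in events if e["guest_id"] in local_guests]
    return kept, len(events) - len(kept)


sessions = build_sessions(NUM_JVMS, GUESTS_PER_JVM)

# Assumption: one event per logged-in guest, all published to the single shared topic.
events = [{"guest_id": g, "payload": "data"} for g in range(NUM_JVMS * GUESTS_PER_JVM)]

# JVM 0 reads the whole topic but only cares about its own 1,000 guests.
kept, discarded = consume(events, sessions[0])
discard_ratio = discarded / len(events)
print(f"JVM 0 kept {len(kept)}, discarded {discarded} ({discard_ratio:.0%})")
```

With events spread evenly across guests, each JVM throws away 95% of what it reads, which is the overhead I was hoping to avoid by using more topics.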
 From: Timothy Chen <[EMAIL PROTECTED]>
Sent: Thursday, June 13, 2013 2:13 PM
Subject: Re: Using Kafka for "data" messages

Also since you're going to be creating a topic per user, the number of
concurrent users will also be a concern to Kafka as it doesn't like massive
amounts of topics.

On Thu, Jun 13, 2013 at 10:47 AM, Josh Foure <[EMAIL PROTECTED]> wrote:
