Kafka >> mail # user >> Trade-off between topics and partitions?


Re: Trade-off between topics and partitions?
Deja vu!

IMO, what you are describing is a database problem, even though you are
talking/thinking about it as a queue problem. I'm sure you could construct
something using Kafka (and Samza), but I think you'd have an easier time
with a database. The number of pending messages per user and the average
message size would be critical in selecting exactly which sort of database
to use.
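
To make the database idea concrete, here is a minimal sketch using SQLite from the Python standard library. The table name, column names, and the helpers open_store, publish, and drain are all hypothetical, picked only for illustration; a real system with 1 million users would need a database chosen for the actual message volume and size. It only shows the shape of the idea: store pending messages per user with a monotonically increasing sequence number, and hand them over in that order when the user connects.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical per-user pending-message store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # global, increasing publish order
        " user_id TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS pending_by_user ON pending (user_id, seq)"
    )
    return conn


def publish(conn, user_id, body):
    """Queue one message for a user, whether or not they are connected."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO pending (user_id, body) VALUES (?, ?)", (user_id, body)
        )


def drain(conn, user_id):
    """Return the user's pending messages in publish order and delete them."""
    with conn:
        rows = conn.execute(
            "SELECT seq, body FROM pending WHERE user_id = ? ORDER BY seq",
            (user_id,),
        ).fetchall()
        if rows:
            # Delete only what was read, so later publishes are not lost.
            conn.execute(
                "DELETE FROM pending WHERE user_id = ? AND seq <= ?",
                (user_id, rows[-1][0]),
            )
    return [body for _, body in rows]
```

Calling drain(conn, "alice") when alice reconnects returns her backlog oldest first; a second call returns an empty list until something new is published for her.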

My $0.02.

On Thu, Dec 5, 2013 at 7:47 PM, mission mission <[EMAIL PROTECTED]> wrote:

> Hello,
>
> According to the Kafka FAQ "How do I choose the number of partitions for a
> topic", clusters with more than 10K partitions are not tested. I am looking
> for advice on how to scale the number of partitions beyond that. My use
> case is to publish messages to 1 million users, each with a unique user
> id. Users are not always connected but a user must receive published
> messages in order.
>
> What is the best way to divide topics and partitions for this use case? Do
> I need 1 million partitions? The FAQ seems to think so, i.e. "if we were
> storing notifications for users we would encourage a design with a single
> notifications topic partitioned by user id". But the FAQ implies strongly
> that 1 million partitions may wreak havoc on zookeeper because they will
> lead to X million znodes that have to be stored in memory. Any suggestions?
>
> Thanks,
>
> mission
>
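
For what it's worth, "partitioned by user id" in the FAQ quote does not have to mean one partition per user. Kafka preserves order within a partition, so it is enough that all of a user's messages land in the same partition. A rough sketch of that mapping follows; the partition count of 64, the CRC32 hash, and the helper names are assumptions for illustration only, and this is not Kafka's own partitioner or client API. The partitions here are simulated with plain Python lists.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 64  # assumed value, far below the 10K mentioned in the FAQ


def partition_for(user_id, num_partitions=NUM_PARTITIONS):
    """Map a user id to a fixed partition (illustrative hash, not Kafka's)."""
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


def route(messages, num_partitions=NUM_PARTITIONS):
    """Append (user_id, body) pairs to simulated partition logs in arrival order."""
    partitions = defaultdict(list)
    for user_id, body in messages:
        partitions[partition_for(user_id, num_partitions)].append((user_id, body))
    return partitions


def messages_for(partitions, user_id, num_partitions=NUM_PARTITIONS):
    """Read one user's messages back from their partition, order preserved."""
    log = partitions.get(partition_for(user_id, num_partitions), [])
    return [body for uid, body in log if uid == user_id]
```

The catch is on the read side: many users share each partition, so whoever reads on behalf of one user has to scan or filter past everyone else's messages. That cost is a large part of why the reply above suggests a database.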

 