What version of Kafka is this?
In general, throughput scales linearly with the number of machines, or
more specifically the number of disks. The real bottleneck is the number
of partitions: with thousands of partitions, leader election can get
slower (seconds), and if you have consumers that consume all partitions,
rebalancing in those consumers can get slow (minutes). We hope to fix
these issues, but that is the current state up through 0.8.
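
As a rough illustration of the linear-scaling point, here is a
back-of-the-envelope sizing sketch. The function name and all of the
throughput figures are hypothetical placeholders, not measurements; it
assumes every destination broker sustains the same write rate, and it
ignores replication factor and the partition-count limits mentioned
above, so you would want to plug in numbers measured on your own
hardware.

```python
import math


def brokers_needed(source_mb_per_sec, per_broker_mb_per_sec, headroom=0.0):
    """Estimate destination broker count, assuming throughput scales linearly.

    source_mb_per_sec: inbound rate (MB/s) from each source cluster.
    per_broker_mb_per_sec: assumed sustained write rate of one destination
        broker (hypothetical; measure this on your own machines and disks).
    headroom: fractional spare capacity to reserve, e.g. 0.25 for 25%.
    """
    if per_broker_mb_per_sec <= 0:
        raise ValueError("per_broker_mb_per_sec must be positive")
    total = sum(source_mb_per_sec)
    required = total * (1.0 + headroom)
    # Round up: a fractional broker still means one more machine.
    return max(1, math.ceil(required / per_broker_mb_per_sec))


# Hypothetical example: two source clusters each producing 100 MB/s,
# destination brokers assumed to sustain 20 MB/s each.
print(brokers_needed([100, 100], 20))
print(brokers_needed([100, 100], 20, headroom=0.25))
```

If the destination brokers have the same hardware as the source brokers
and the source clusters are running near capacity, this comes out to
roughly the sum of the source broker counts. If the sources are lightly
loaded, the destination can be smaller.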
On Fri, Aug 2, 2013 at 2:27 PM, Scott Arthur <[EMAIL PROTECTED]> wrote:
> I have a question about scaling the broker count of a Kafka cluster. We
> have a scenario where we'll have two clusters replicating data into a
> third. We're wondering how we should size that third cluster so that it
> can handle the volume of messages from the two source clusters. Should we
> just make the number of brokers match? e.g. five brokers in the two source
> clusters, therefore 10 in the destination cluster. In general, what is the
> horizontal scaling model we should use? Also, is there an upper limit to
> the number of brokers you should put in a cluster, after which you get
> diminishing returns on throughput?
> Scott Arthur