Currently, a partition is the smallest unit by which we distribute data
among consumers (in the same consumer group). So, if the number of
consumers is larger than the total number of partitions for a topic in the
Kafka cluster (across all brokers), some consumers will never get any
data. This assignment is made on a per-topic basis. If a consumer consumes
multiple topics, it would make sense to divide the partitions of all
topics among the consumers. We haven't done that yet. Part of the reason
is that we need to figure out how to balance the data across topics, since
they can be of different sizes. We can look into that post 0.8.
For now, the solution is to increase the number of partitions on the broker.
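
To illustrate why extra consumers sit idle, here is a rough sketch of a
per-topic assignment. It is not the actual Kafka code: the function name
assign_partitions and the consumer ids are made up for the example, and it
assumes partitions are simply split into contiguous ranges over the sorted
consumers of one group.

```python
def assign_partitions(partitions, consumers):
    """Split one topic's partitions among the consumers of one group.

    Hypothetical helper for illustration only; assumes a simple
    range-style split over sorted partitions and sorted consumer ids.
    """
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    # Every consumer gets n_per partitions; the first `extra` get one more.
    n_per, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    for i, consumer in enumerate(consumers):
        start = i * n_per + min(i, extra)
        count = n_per + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
    return assignment


# Two partitions, three consumers: the third consumer owns nothing.
print(assign_partitions([0, 1], ["c1", "c2", "c3"]))
# Four partitions, three consumers: every consumer owns at least one.
print(assign_partitions([0, 1, 2, 3], ["c1", "c2", "c3"]))
```

Since a partition is never split between two consumers in the group, the
only way to give the idle consumer work in this model is to add partitions.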
On Mon, Jan 7, 2013 at 9:03 AM, Pablo Barrera González <
[EMAIL PROTECTED]> wrote: