Kafka, mail # user - Partitioning and scale

Re: Partitioning and scale
Chris Curtin 2013-05-22, 19:37
Hi Tim,
On Wed, May 22, 2013 at 3:25 PM, Timothy Chen <[EMAIL PROTECTED]> wrote:

> Hi,
> I'm currently trying to understand how Kafka (0.8) can scale with our usage
> pattern and how to set up the partitioning.
> We want to route all messages belonging to the same id to the same
> queue, so its consumer will be able to consume all the messages of that id.
> My questions:
>  - From my understanding, in Kafka we would need to have a custom
> partitioner that routes the same messages to the same partition, right?  I'm
> trying to find examples of writing this partitioner logic, but I can't find
> any. Can someone point me to an example?

https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example

The partitioner here does a simple mod of the IP address by the # of
partitions. You'd need to define your own logic, but this is a start.
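
As a rough sketch of what "your own logic" could look like for routing by id
(this is not the code from the wiki page): the class and method names below
are hypothetical, and the block deliberately avoids the Kafka API so it runs
on its own. It assumes the id is a String and that hashing it is an
acceptable way to spread ids across partitions. In a real 0.8 producer the
same arithmetic would sit inside your partitioner class.

```java
// Hypothetical helper, not part of Kafka: maps a message id to a partition index.
final class IdPartitioner {
    private IdPartitioner() {}

    // The same id with the same partition count always yields the same index.
    static int partitionFor(String id, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // String.hashCode() can be negative; Math.floorMod keeps the result
        // in the range [0, numPartitions).
        return Math.floorMod(id.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        // Illustrative ids only; the repeated id shows the mapping is stable.
        String[] ids = {"order-17", "order-17", "order-42"};
        for (String id : ids) {
            System.out.println(id + " -> partition " + partitionFor(id, 4));
        }
    }
}
```

Note that the result depends on numPartitions, so with this kind of mod logic
an id can land on a different partition once the partition count changes.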
> - I see that Kafka's server.properties allows one to specify the number of
> partitions it supports. However, when we want to scale, I wonder: if we
> increase the # of partitions or # of brokers, will the same partitioner
> start distributing the messages to different partitions?
>  And if it does, how can the same consumer continue to read the
> messages for those ids if it was interrupted in the middle?

I'll let someone else answer this.
> - I'd like to create a consumer per partition, and for each one to
> subscribe to the changes of that partition. How can this be done in Kafka?

Two ways: Simple Consumer or Consumer Groups:

Depends on the level of control you want on code processing a specific
partition vs. getting one assigned to it (and level of control over offset


> Thanks,
> Tim