Kafka >> mail # user >> Partitioning and scale


+
Timothy Chen 2013-05-22, 19:26
Re: Partitioning and scale
Hi Tim,
On Wed, May 22, 2013 at 3:25 PM, Timothy Chen <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm currently trying to understand how Kafka (0.8) can scale with our usage
> pattern and how to set up the partitioning.
>
> We want to route all messages belonging to the same id to the same
> queue, so its consumer will be able to consume all the messages for that id.
>
> My questions:
>
>  - From my understanding, in Kafka we would need a custom partitioner
> that routes messages with the same id to the same partition, right?  I'm
> trying to find examples of writing this partitioner logic, but I can't find
> any. Can someone point me to an example?
>
> https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example

The partitioner in that example takes the message key (an IP address) and
does a simple mod against the number of partitions. You'd need to define
your own logic for your ids, but it's a starting point.

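To make the idea concrete, here is a minimal sketch of that kind of routing logic, written in plain Python rather than against the Kafka producer API. The function name `partition_for` and the sample ids are made up for illustration, and using CRC32 as the hash is my assumption, not something the wiki example does. The only point is that the same id always lands on the same partition index as long as the partition count stays fixed.

```python
import zlib


def partition_for(message_id: str, num_partitions: int) -> int:
    """Map an id to a partition index in [0, num_partitions).

    Hypothetical helper, not part of any Kafka client library.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is stable across processes and runs, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    return zlib.crc32(message_id.encode("utf-8")) % num_partitions


# Two messages with the same id get the same partition index.
ids = ["user-17", "user-42", "user-17"]
assignments = [partition_for(i, 4) for i in ids]
```

In a real producer this logic would go inside whatever partitioner class your client version expects. That wiring is not shown here.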
> - I see that Kafka's server.properties allows one to specify the number of
> partitions it supports. However, when we want to scale, I wonder: if we
> increase the number of partitions or the number of brokers, will the same
> partitioner start distributing the messages to different partitions?
>  And if it does, how can the same consumer continue to read the
> messages for those ids if it was interrupted in the middle?
>

I'll let someone else answer this.
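
This is not an answer to the Kafka side of the question, but the arithmetic behind the concern can be checked on its own. A rough sketch, assuming numeric ids and a plain mod partitioner (`mod_partition` is a made-up name): when the partition count changes, most ids map to a different index.

```python
def mod_partition(numeric_id: int, num_partitions: int) -> int:
    """Plain modulo routing; a hypothetical stand-in for a custom partitioner."""
    return numeric_id % num_partitions


ids = range(12)
before = {i: mod_partition(i, 4) for i in ids}  # 4 partitions
after = {i: mod_partition(i, 6) for i in ids}   # grown to 6 partitions

# Ids whose partition index differs between the two partition counts.
moved = [i for i in ids if before[i] != after[i]]
print(moved)  # [4, 5, 6, 7, 8, 9, 10, 11]
```

So with this kind of partitioner, any consumer logic that assumes "id X is always in partition P" only holds while the partition count is unchanged.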
>
> - I'd like to create one consumer per partition, and have each one
> subscribe to the messages of that partition. How can this be done in Kafka?
>

Two ways: the Simple Consumer or Consumer Groups.

Which one fits depends on how much control you want: whether your code
picks a specific partition to process or has one assigned to it, and how
much control you need over offset management.

https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example

https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
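
To illustrate the difference between the two approaches, here is a toy model in plain Python. It does not use any Kafka client, and `assign_partitions` is a hypothetical function, not Kafka's actual rebalancing code. In the "simple" style your code names the partition each worker reads. In the "group" style something else divides the partitions among the members, and you get one consumer per partition only when the member count equals the partition count.

```python
def assign_partitions(partitions, consumers):
    """Toy split of partitions across group members (illustrative only).

    Each consumer gets a contiguous slice; earlier consumers take
    one extra partition when the split is uneven.
    """
    consumers = sorted(consumers)
    partitions = sorted(partitions)
    base, extra = divmod(len(partitions), len(consumers))
    result, start = {}, 0
    for idx, consumer in enumerate(consumers):
        count = base + (1 if idx < extra else 0)
        result[consumer] = partitions[start:start + count]
        start += count
    return result


# "Simple" style: the code itself decides which partition each worker reads.
manual = {"worker-0": [0], "worker-1": [1], "worker-2": [2]}

# "Group" style: partitions are handed out; 3 members over 3 partitions
# ends up as one partition each.
assigned = assign_partitions([0, 1, 2], ["c1", "c2", "c3"])
uneven = assign_partitions(range(4), ["a", "b"])
```

The wiki pages linked above show how each style looks with the real client.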
>
> Thanks,
>
> Tim
>

 
+
Neha Narkhede 2013-05-22, 20:15
+
Timothy Chen 2013-05-22, 21:20
+
Neha Narkhede 2013-05-22, 23:32
+
Timothy Chen 2013-05-23, 23:22
+
Milind Parikh 2013-05-23, 23:36
+
Neha Narkhede 2013-05-24, 15:40