Kafka, mail # user - Partitioning and scale


Timothy Chen 2013-05-22, 19:26
Chris Curtin 2013-05-22, 19:37
Neha Narkhede 2013-05-22, 20:15
Timothy Chen 2013-05-22, 21:20
Neha Narkhede 2013-05-22, 23:32
Timothy Chen 2013-05-23, 23:22

Re: Partitioning and scale
Milind Parikh 2013-05-23, 23:36
The number of files the OS has to manage, I suppose.

Why wouldn't you use consistent hashing with deliberately engineered
collisions to generate a limited number of topics / partitions and filter
at the consumer level?
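
A minimal sketch of that idea, under stated assumptions: a plain hash-modulo mapping stands in for a true consistent-hash ring, and the partition count, the helper names (`partition_for`, `events_for_session`) and the in-memory "partitions" are all hypothetical illustrations, not Kafka APIs. It only shows how many session ids can deliberately share a small, fixed set of partitions while the consumer filters for the session it cares about.

```python
import zlib

NUM_PARTITIONS = 64  # hypothetical fixed partition count, chosen up front


def partition_for(session_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a session id to one of a fixed number of partitions.

    Many session ids collide on the same partition by design. crc32 is used
    because it is stable across processes (unlike Python's built-in hash()).
    """
    return zlib.crc32(session_id.encode("utf-8")) % num_partitions


def events_for_session(messages, session_id):
    """Consumer-side filter: keep only the payloads of one session."""
    return [payload for key, payload in messages if key == session_id]


# Simulated producer: route (session_id, payload) pairs into partition buckets.
events = [
    ("s1", "login"),
    ("s2", "login"),
    ("s1", "click"),
    ("s3", "view"),
    ("s1", "logout"),
]
partitions = {}
for key, payload in events:
    partitions.setdefault(partition_for(key), []).append((key, payload))

# Simulated consumer: read the one partition holding "s1", drop other sessions.
s1_events = events_for_session(partitions[partition_for("s1")], "s1")
```

Because every event for a session lands in the same partition, per-session ordering is preserved within that partition, and the number of log directories on the brokers stays bounded by the partition count rather than by the number of sessions.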

Regards
Milind
On May 23, 2013 4:22 PM, "Timothy Chen" <[EMAIL PROTECTED]> wrote:

> Hi Neha,
>
> Not sure if this sounds crazy, but if we'd like the events for the same
> session id to go to the same partition, one way could be for each session
> key to create its own topic with a single partition, so there could be
> millions of topics, each with a single partition.
>
> I wonder what the bottleneck of doing this would be?
>
> Thanks,
>
> Tim
>
>
> On Wed, May 22, 2013 at 4:32 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>
> > Not automatically as of today. You have to run the reassign-partitions
> > tool and explicitly move selected partitions to the new brokers. If you
> > use this tool, you can move partitions to the new broker without any
> > downtime.
> >
> > Thanks,
> > Neha
> >
> >
> > On Wed, May 22, 2013 at 2:20 PM, Timothy Chen <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Neha/Chris,
> > >
> > > Thanks for the reply. If I set a fixed number of partitions and just
> > > add brokers to the broker pool, does it rebalance the load to the new
> > > brokers (along with the data)?
> > >
> > > Tim
> > >
> > >
> > > On Wed, May 22, 2013 at 1:15 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
> > >
> > > > - I see that Kafka server.properties allows one to specify the number
> > > > of partitions it supports. However, when we want to scale, I wonder:
> > > > if we add partitions or brokers, will the same partitioner start
> > > > distributing the messages to different partitions?
> > > > And if it does, how can the same consumer continue to read the
> > > > messages of those ids if it was interrupted in the middle?
> > > >
> > > > The num.partitions config in server.properties is used only for
> > > > topics that are auto-created (controlled by auto.create.topics.enable).
> > > > For topics that you create using the admin tool, you can specify the
> > > > number of partitions that you want. After that, there is currently no
> > > > way to change it. For that reason, it is a good idea to over-partition
> > > > your topic, which also helps load-balance partitions onto the brokers.
> > > > You are right that if you change the number of partitions later,
> > > > messages that previously stuck to a certain partition would get routed
> > > > to a different partition, which is undesirable for applications that
> > > > want to use sticky partitioning.
> > > >
> > > > - I'd like to create a consumer per partition, and have each one
> > > > subscribe to the changes of that partition. How can this be done in
> > > > Kafka?
> > > >
> > > > For your use case, it seems like SimpleConsumer might be a better fit.
> > > > However, it will require you to write code to handle discovery of the
> > > > leader for the partition that your consumer is consuming. Chris has
> > > > written up a great example that you can follow:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > >
> > > > On Wed, May 22, 2013 at 12:37 PM, Chris Curtin <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi Tim,
> > > > >
> > > > >
> > > > > On Wed, May 22, 2013 at 3:25 PM, Timothy Chen <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm currently trying to understand how Kafka (0.8) can scale with
> > > > > > our usage pattern and how to set up the partitioning.
> > > > > >
> > > > > > We want to route all messages belonging to the same id to the same
> > > > > > queue, so its consumer will be able to consume all the messages of
> > > > > > that id.
> > > > > >
> > > > > > My questions:
> > > > > >

 
Neha Narkhede 2013-05-24, 15:40