Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Kafka, mail # user - Num of streams for consumers using TopicFilter.


+
Rajasekar Elango 2013-08-29, 15:03
+
Neha Narkhede 2013-08-29, 16:42
+
Rajasekar Elango 2013-08-29, 19:44
+
Jun Rao 2013-08-30, 03:42
Copy link to this message
-
Re: Num of streams for consumers using TopicFilter.
Rajasekar Elango 2013-08-30, 03:52
Hi Jun,

If you read my previous posts, based on current re balancing logic, if we
consumer from topic filter, consumer actively use all streams. Can you
provide your recommendation of option 1 vs option 2 in my previous post?

Thanks,
Raja.
On Thu, Aug 29, 2013 at 11:42 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

> You can always use more partitions to get more parallelism in the
> consumers.
>
> Thanks,
>
> Jun
>
>
> On Thu, Aug 29, 2013 at 12:44 PM, Rajasekar Elango
> <[EMAIL PROTECTED]>wrote:
>
> > So what is best way to load balance multiple consumers consuming from
> topic
> > filter.
> >
> > Let's say we have 4 topics with 8 partitions and 2 consumers.
> >
> > Option 1) To load balance consumers, we can set num.streams=4 so that
> both
> > consumers split 8 partitions. but can only use half of consumer streams.
> >
> > Option 2) Configure mutually exclusive topic filter regex such that 2
> > topics will match consumer1 and 2 topics will match consumer2. Now we can
> > set num.streams=8 and fully utilize consumer streams. I believe this will
> > improve performance, but if consumer dies, we will not get any data from
> > the topic used by that consumer.
> >
> > What would be your recommendation?
> >
> > Thanks,
> > Raja.
> >
> >
> > On Thu, Aug 29, 2013 at 12:42 PM, Neha Narkhede <[EMAIL PROTECTED]
> > >wrote:
> >
> > > >> 2) When I started mirrormaker with num.streams=16, looks like 16
> > > consumer
> > > threads were created, but only 8 are showing up as active as owner in
> > > consumer offset tracker and all topics/partitions are distributed
> > between 8
> > > consumer threads.
> > >
> > > This is because currently the consumer rebalancing process of assigning
> > > partitions to consumer streams is at a per topic level. Unless you have
> > at
> > > least one topic with 16 partitions, the remaining 8 threads will not do
> > any
> > > work. This is not ideal and we want to look into a better rebalancing
> > > algorithm. Though it is a big change and we prefer doing it as part of
> > the
> > > consumer client rewrite.
> > >
> > > Thanks,
> > > Neha
> > >
> > >
> > > On Thu, Aug 29, 2013 at 8:03 AM, Rajasekar Elango <
> > [EMAIL PROTECTED]
> > > >wrote:
> > >
> > > > So my understanding is num of active streams that a consumer can
> > utilize
> > > is
> > > > number of partitions in topic. This is fine if we consumer from
> > specific
> > > > topic. But if we consumer from TopicFilter, I thought consumer should
> > > able
> > > > to utilize (number of topics that match filter * number of partitions
> > in
> > > > topic) . But looks like number of streams that consumer can use is
> > > limited
> > > > by just number if partitions in topic although it's consuming from
> > > multiple
> > > > topic.
> > > >
> > > > Here what I observed with 1 mirrormaker consuming from whitelist
> '.+'.
> > > >
> > > > The white list matches 5 topics and each topic has 8 partitions. I
> used
> > > > consumer offset checker to look at owner of each/topic partition.
> > > >
> > > > 1) When I started mirrormaker with num.streams=8, all
> topics/partitions
> > > are
> > > > distributed between 8 consumer threads.
> > > >
> > > > 2) When I started mirrormaker with num.streams=16, looks like 16
> > consumer
> > > > threads were created, but only 8 are showing up as active as owner in
> > > > consumer offset tracker and all topics/partitions are distributed
> > > between 8
> > > > consumer threads.
> > > >
> > > > So this could be bottleneck for consumers as although we partitioned
> > > topic,
> > > > if we are consuming from topic filter it can't utilize much of
> > > parallelism
> > > > with num of streams. Am i missing something, is there a way to make
> > > > cosumers/mirrormakers to utilize more number of active streams?
> > > >
> > > >
> > > > --
> > > > Thanks,
> > > > Raja.
> > > >
> > >
> >
> >
> >
> > --
> > Thanks,
> > Raja.
> >
>

--
Thanks,
Raja.

 
+
Jun Rao 2013-08-30, 04:02
+
Rajasekar Elango 2013-08-30, 04:09
+
Jun Rao 2013-08-30, 14:34
+
Rajasekar Elango 2013-08-30, 15:24
+
Jun Rao 2013-08-31, 03:41
+
Jason Rosenberg 2013-10-02, 20:29
+
Jun Rao 2013-10-03, 04:24
+
Jason Rosenberg 2013-10-03, 04:43
+
Jun Rao 2013-10-03, 14:25
+
Jason Rosenberg 2013-10-03, 14:57
+
Jason Rosenberg 2013-10-03, 15:12