Kafka >> mail # user >> Num of streams for consumers using TopicFilter.
Num of streams for consumers using TopicFilter.
My understanding is that the number of active streams a consumer can use is
bounded by the number of partitions in the topic. That is fine when we
consume from a specific topic. But when we consume with a TopicFilter, I
expected the consumer to be able to use up to (number of topics that match
the filter * number of partitions per topic) streams. Instead, it looks like
the number of streams the consumer can use is limited to the number of
partitions in a single topic, even though it is consuming from multiple
topics.

Here is what I observed with one mirrormaker consuming from whitelist '.+'.

The whitelist matches 5 topics, and each topic has 8 partitions. I used the
consumer offset checker to look at the owner of each topic/partition.

1) When I started mirrormaker with num.streams=8, all topics/partitions were
distributed among the 8 consumer threads.

2) When I started mirrormaker with num.streams=16, it looks like 16 consumer
threads were created, but only 8 show up as owners in the consumer offset
checker, and all topics/partitions are distributed among those 8 consumer
threads.
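
To illustrate what I think is happening, here is a small Python model. It
assumes (I have not confirmed this in the Kafka source) that partitions are
assigned to consumer threads one topic at a time, in contiguous ranges. The
function name, stream names, and topic names are made up for illustration.

```python
from collections import defaultdict


def range_assign(topics, partitions_per_topic, num_streams):
    """Hypothetical model: assign partitions topic by topic, in ranges."""
    streams = [f"stream-{i:02d}" for i in range(num_streams)]
    owners = defaultdict(list)
    for topic in topics:
        # Split this topic's partitions evenly; the first `extra` streams
        # get one more partition than the rest.
        per_stream, extra = divmod(partitions_per_topic, num_streams)
        start = 0
        for i, stream in enumerate(streams):
            count = per_stream + (1 if i < extra else 0)
            for partition in range(start, start + count):
                owners[stream].append((topic, partition))
            start += count
    return owners


topics = [f"topic{i}" for i in range(5)]  # 5 topics matched by the whitelist
for num_streams in (8, 16):
    owners = range_assign(topics, 8, num_streams)
    print(num_streams, "streams created,", len(owners), "own partitions")
```

Under that assumption, the same 8 streams win a partition in every topic, so
the other 8 streams stay idle no matter how many topics match the filter.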

So this could be a bottleneck for consumers: even though we partitioned the
topics, a consumer using a topic filter can't get much more parallelism by
raising the number of streams. Am I missing something? Is there a way to make
consumers/mirrormakers use a larger number of active streams?
--
Thanks,
Raja.