Kafka >> mail # user >> seeing poor consumer performance in 0.7.2


Re: seeing poor consumer performance in 0.7.2
Oh... and at this point I'm talking about consumers that do no processing
and produce no other output; they simply send UDP packets to Graphite.
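For reference, that reporting path is tiny: each consumed message becomes one datagram in Graphite's plaintext protocol. A minimal sketch of what such a print-only consumer's output side could look like (the host, port, and metric name here are assumptions for illustration, not details from the thread):

```python
import socket
import time

def send_graphite_metric(sock, host, port, path, value, timestamp=None):
    """Format one metric in Graphite's plaintext protocol
    ("<path> <value> <unix_timestamp>\n") and send it as a UDP datagram.
    UDP is fire-and-forget, so the consumer thread never blocks on this."""
    if timestamp is None:
        timestamp = int(time.time())
    line = "%s %s %d\n" % (path, value, timestamp)
    sock.sendto(line.encode("ascii"), (host, port))
    return line  # returned so callers can log/inspect what was sent

# Hypothetical usage: one metric per consumed message
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_graphite_metric(sock, "127.0.0.1", 2003, "kafka.consumer.messages", 1)
```

Since the send is UDP, a slow or absent Graphite endpoint cannot back-pressure the consumer, which supports the point that the bottleneck is unlikely to be on the output side.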
On Mon, Apr 22, 2013 at 9:13 PM, Andrew Neilson <[EMAIL PROTECTED]> wrote:

> Hmm, it is highly unlikely that that is the culprit... there is plenty of
> bandwidth available for me to use. I will definitely keep it in mind,
> though. I was working on this today and have some additional information
> and thoughts that you might be able to shed some light on:
>
>    - I mentioned I have 2 consumers, but each consumer is running with 8
>    threads for this topic (and each consumer has 8 cores available).
>    - When I initially asked for help, the brokers were configured with
>    num.partitions=1. I've since tried higher numbers (3, 64) and haven't
>    seen much of an improvement, aside from forcing both consumer apps to
>    handle messages (with the overall performance not changing much).
>    - I ran into this article
>    http://riccomini.name/posts/kafka/2012-10-05-kafka-consumer-memory-tuning/
>    and tried a variety of options for queuedchunks.max and fetch.size with
>    no significant results (meaning it did not achieve the goal of
>    constantly processing hundreds or thousands of messages per second,
>    which is similar to the rate of input). I would not be surprised if I'm
>    wrong, but this made me start to think that the problem may lie outside
>    of the consumers.
>    - Would the combination of a high number of partitions (64) and a high
>    log.flush.interval (10k messages) prevent logs from flushing as often
>    as they need to for my desired rate of consumption (even with
>    log.default.flush.interval.ms=1000)?
>
> Despite the changes I mentioned, the behaviour is the same: the consumers
> receive large spikes of messages mixed with periods of complete
> inactivity, and overall there is a long delay (about 2 minutes) between
> messages being written and messages being read. Anyway... as always, I
> greatly appreciate any help.
>
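For anyone following along, the settings mentioned in this thread map onto plain Kafka 0.7 properties files. A sketch of where each knob lives (the values shown are illustrative or the ones being experimented with above, not recommendations, and defaults may differ in your build):

```properties
# consumer.properties (Kafka 0.7 high-level consumer)
fetch.size=1048576        # max bytes fetched per partition per request
queuedchunks.max=10       # fetched chunks buffered per consumer thread

# server.properties (broker)
num.partitions=64                   # partitions created per topic
log.flush.interval=10000            # messages appended before forcing an fsync
log.default.flush.interval.ms=1000  # time-based flush fallback
```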
> On Sun, Apr 21, 2013 at 8:50 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
>> Is your network shared? If so, another possibility is that some other apps
>> are consuming the bandwidth.
>>
>> Thanks,
>>
>> Jun
>>
>>
>> On Sun, Apr 21, 2013 at 12:23 PM, Andrew Neilson <[EMAIL PROTECTED]> wrote:
>>
>> > Thanks very much for the reply Neha! So I swapped out the consumer that
>> > processes the messages with one that just prints them. It does indeed
>> > achieve a much better rate at peaks but can still nearly (if not
>> > completely) zero out. I plotted the messages printed per second in
>> > Graphite to show the behaviour I'm seeing:
>> >
>> > https://www.dropbox.com/s/7u7uyrefw6inetu/Screen%20Shot%202013-04-21%20at%2011.44.38%20AM.png
>> >
>> > The peaks are over ten thousand per second and the troughs can go below
>> > 10 per second just prior to another peak. I know that there are plenty
>> > of messages available because the ones currently being processed are
>> > still from Friday afternoon, so this may or may not have something to do
>> > with this pattern.
>> >
>> > Is there anything I can do to avoid the periods of lower performance?
>> > Ideally I would be processing messages as soon as they are written.
>> >
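One way to see this spiky pattern without Graphite in the loop is to wrap the message stream and count per-second throughput directly in the print-only consumer. A sketch under stated assumptions (the `messages` iterable stands in for whatever the consumer stream yields; the injectable `clock` and `report` parameters exist only to make the sketch testable):

```python
import time

def rate_counter(messages, interval=1.0, clock=time.time, report=print):
    """Yield messages unchanged while reporting throughput once per
    interval -- one way the 'messages printed per second' series in
    the graph above could be produced."""
    count = 0
    window_start = clock()
    for msg in messages:
        count += 1
        now = clock()
        if now - window_start >= interval:
            report("%d msgs/sec" % round(count / (now - window_start)))
            count = 0
            window_start = now
        yield msg

# Hypothetical usage: wrap the consumer's message iterator
# for msg in rate_counter(stream):
#     print(msg)
```

A counter like this makes the troughs visible even when the downstream metrics path is suspect, which helps separate "consumer receives nothing" from "consumer receives but reports nothing".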
>> >
>> > On Sun, Apr 21, 2013 at 8:49 AM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>> >
>> > > Some of the reasons a consumer is slow are -
>> > > 1. Small fetch size
>> > > 2. Expensive message processing
>> > >
>> > > Are you processing the received messages in the consumer? Have you
>> > > tried running the console consumer for this topic to see how it
>> > > performs?
>> > >
>> > > Thanks,
>> > > Neha
>> > >
>> > > On Sun, Apr 21, 2013 at 1:59 AM, Andrew Neilson <[EMAIL PROTECTED]> wrote:
>> > > > I am currently running a deployment with 3 brokers, 3 ZK, 3
>> > > > producers, 2 consumers, and 15 topics. I should first point out that this is my