Re: seeing poor consumer performance in 0.7.2
The only other thing being written to these disks is the log4j output
(kafka.out), so technically they are not dedicated to the data logs. The
disks are 250GB SATA.

On Fri, Apr 26, 2013 at 6:35 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:

> >    - Decreased num.partitions and log.flush.interval on the brokers from
> >      64/10k to 32/100 in order to lower the average flush time (we were
> >      previously always hitting the default flush interval since no
> >      partitions
>
> Hmm, that is a pretty low value for the flush interval, leading to higher
> disk usage. Do you use dedicated disks for the Kafka data logs? Also, what
> sort of disks do you use?
>
> Thanks,
> Neha
>
> >
> >
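For context, the broker settings under discussion are plain entries in the
broker's server.properties. A rough sketch of the two configurations being
compared, using only the 0.7-era property names that appear in this thread
(the values are the ones mentioned above, not recommendations):

    # Kafka 0.7.x broker, server.properties
    # original setup: 64 partitions, flush a log after 10k messages
    num.partitions=64
    log.flush.interval=10000
    # revised setup: 32 partitions, flush after every 100 messages
    #num.partitions=32
    #log.flush.interval=100
    # time-based bound on unflushed data (ms), also mentioned below
    log.default.flush.interval.ms=1000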
> > On Tue, Apr 23, 2013 at 7:53 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >
> > > You can run kafka.tools.ConsumerOffsetChecker to check the consumer
> > > lag. If the consumer is lagging, this indicates a problem on the
> > > consumer side.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
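For anyone reading along, that tool ships with the broker and can be run
through the stock class-runner script. A minimal sketch, assuming ZooKeeper
on localhost:2181 and a hypothetical group name (exact flag names can vary
across releases; running the tool with no arguments prints its usage):

    bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
        --zkconnect localhost:2181 \
        --group my-consumer-group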
> > > On Mon, Apr 22, 2013 at 9:13 PM, Andrew Neilson <[EMAIL PROTECTED]> wrote:
> > >
> > > > Hmm, it is highly unlikely that that is the culprit... there is
> > > > plenty of bandwidth available for me to use. I will definitely keep
> > > > that in mind, though. I was working on this today and have some
> > > > tidbits of additional information and thoughts that you might be
> > > > able to shed some light on:
> > > >
> > > >    - I mentioned I have 2 consumers, but each consumer is running
> > > >      with 8 threads for this topic (and each consumer has 8 cores
> > > >      available); see the consumer sketch after this list.
> > > >    - When I initially asked for help the brokers were configured
> > > >      with num.partitions=1. I've since tried higher numbers (3, 64)
> > > >      and haven't seen much of an improvement, aside from forcing
> > > >      both consumer apps to handle messages (with the overall
> > > >      performance not changing much).
> > > >    - I ran into this article:
> > > >      http://riccomini.name/posts/kafka/2012-10-05-kafka-consumer-memory-tuning/
> > > >      and tried a variety of options for queuedchunks.max and
> > > >      fetch.size, with no significant results (meaning it did not
> > > >      achieve the goal of constantly processing hundreds or thousands
> > > >      of messages per second, which is similar to the rate of input);
> > > >      a tuning sketch follows after the next paragraph. I would not
> > > >      be surprised if I'm wrong, but this made me start to think that
> > > >      the problem may lie outside of the consumers.
> > > >    - Would the combination of a high number of partitions (64) and a
> > > >      high log.flush.interval (10k) prevent logs from flushing as
> > > >      often as they need to for my desired rate of consumption (even
> > > >      with log.default.flush.interval.ms=1000)?
> > > >
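The 2-consumers-by-8-threads setup in the first bullet maps onto the 0.7
high-level consumer roughly as follows. This is a sketch only: the ZooKeeper
address, group id, and topic name are placeholders, and class and package
names follow the 0.7.x examples (on some 0.7.x point releases the stream
yields Message directly rather than MessageAndMetadata):

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.Message;
    import kafka.message.MessageAndMetadata;

    public class EightThreadConsumer {
        public static void main(String[] args) {
            // connection settings for the high-level consumer (placeholders)
            Properties props = new Properties();
            props.put("zk.connect", "localhost:2181");
            props.put("groupid", "my-group");
            ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            // request 8 streams for the topic, one per worker thread
            Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
            topicCountMap.put("mytopic", 8);
            Map<String, List<KafkaStream<Message>>> streams =
                connector.createMessageStreams(topicCountMap);

            // dedicate one thread to each stream
            ExecutorService executor = Executors.newFixedThreadPool(8);
            for (final KafkaStream<Message> stream : streams.get("mytopic")) {
                executor.submit(new Runnable() {
                    public void run() {
                        for (MessageAndMetadata<Message> mm : stream) {
                            // process mm.message() here
                        }
                    }
                });
            }
        }
    }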
> > > > Despite the changes I mentioned, the behaviour is still the
> > > > consumers receiving large spikes of messages mixed with periods of
> > > > complete inactivity, with an overall long delay (about 2 minutes)
> > > > between messages being written and messages being read. Anyway, as
> > > > always, I greatly appreciate any help.
> > > >
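On the queuedchunks.max/fetch.size bullet above: both are ordinary consumer
properties in 0.7.x, and the linked article models consumer memory use as
roughly fetch.size x queuedchunks.max x number of streams. A sketch with
illustrative values only (not recommendations):

    # Kafka 0.7.x high-level consumer properties; values are illustrative
    # bytes pulled from a broker per fetch request
    fetch.size=1048576
    # fetched chunks buffered in memory per queue
    queuedchunks.max=10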
> > > > On Sun, Apr 21, 2013 at 8:50 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Is your network shared? If so, another possibility is that some
> > > > > other apps are consuming the bandwidth.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > >
> > > > > On Sun, Apr 21, 2013 at 12:23 PM, Andrew Neilson <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Thanks very much for the reply, Neha! So I swapped out the
> > > > > > consumer that processes the messages with one that just prints
> > > > > > them. It does indeed achieve a much better rate at peaks, but it
> > > > > > can still nearly zero out (if not completely zero out). I
> > > > > > plotted the messages printed in Graphite to show the behaviour
> > > > > > I'm seeing (this is messages printed per second):
> > > > > >