Kafka >> mail # user >> Consumer throughput imbalance


Messages in this thread:
Ian Friedman 2013-08-24, 16:59
Neha Narkhede 2013-08-25, 14:46
Ian Friedman 2013-08-25, 16:01
Mark 2013-08-25, 16:14
Ian Friedman 2013-08-25, 17:17
Ian Friedman 2013-08-25, 17:24
Jay Kreps 2013-08-25, 19:12
Ian Friedman 2013-08-26, 05:02
Jay Kreps 2013-08-26, 17:27
Ian Friedman 2013-08-26, 18:19
Jay Kreps 2013-08-26, 18:38
Ian Friedman 2013-08-26, 18:34
Re: Consumer throughput imbalance
Got it, thanks Jay

--
Ian Friedman
On Monday, August 26, 2013 at 2:37 PM, Jay Kreps wrote:

> Yes exactly.
>
> Lowering queuedchunks.max shouldn't help if the problem is what I
> described. That option controls how many chunks the consumer has ready in
> memory for processing. But we are hypothesizing that your problem is
> actually that the individual chunks are just too large, leading to the
> consumer spending a long time processing from one partition before it gets
> the next chunk.
>
> -Jay
>
>
> On Mon, Aug 26, 2013 at 11:18 AM, Ian Friedman <[EMAIL PROTECTED]> wrote:
>
> > Just to make sure I have this right, on the producer side we'd set
> > max.message.size and then on the consumer side we'd set fetch.size? I
> > admittedly didn't research how all the tuning options would affect us,
> > thank you for the info. Would queuedchunks.max have any effect?
> >
> > --
> > Ian Friedman
> >
> >
> > On Monday, August 26, 2013 at 1:26 PM, Jay Kreps wrote:
> >
> > > Yeah it is always equal to the fetch size. The fetch size needs to be at
> > > least equal to the max message size you have allowed on the server,
> > > though.
> > >
> > > -Jay
> > >
> > >
> > > On Sun, Aug 25, 2013 at 10:00 PM, Ian Friedman <[EMAIL PROTECTED]> wrote:
> > >
> > > > Jay - is there any way to control the size of the interleaved chunks?
> > > > The performance hit would likely be negligible for us at the moment.
> > > >
> > > > --
> > > > Ian Friedman
> > > >
> > > >
> > > > On Sunday, August 25, 2013 at 3:11 PM, Jay Kreps wrote:
> > > >
> > > > > I'm still a little confused by your description of the problem. It
> > > > > might be easier to understand if you listed out the exact things you
> > > > > have measured, what you saw, and what you expected to see.
> > > > >
> > > > > Since you mentioned the consumer I can give a little info on how that
> > > > > works. The consumer consumes from all the partitions it owns
> > > > > simultaneously. The behavior is that we interleave fetched data chunks
> > > > > of messages from each partition the consumer is processing. The chunk
> > > > > size is controlled by the fetch size set in the consumer. So the
> > > > > behavior you would expect is that you would get a bunch of messages
> > > > > from one partition followed by a bunch from another partition. The
> > > > > reason for doing this instead of, say, interleaving individual
> > > > > messages is that it is a big performance boost--making every message
> > > > > an entry in a blocking queue gives a 5x performance hit in
> > > > > high-throughput cases. Perhaps this interleaving is the problem?
> > > > >
> > > > > -Jay
> > > > >
> > > > >
> > > > > On Sun, Aug 25, 2013 at 10:22 AM, Ian Friedman <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Sorry I reread what I've written so far and found that it doesn't
> > > > > > state the actual problem very well. Let me clarify once again:
> > > > > >
> > > > > > The problem we're trying to solve is that we can't let messages go
> > > > > > for unbounded amounts of time without getting processed, and it
> > > > > > seems that something about what we're doing (which I suspect is the
> > > > > > fact that consumers own several partitions but only consume from one
> > > > > > of them at a time until it's caught up) is causing a small number of
> > > > > > them to sit around for hours and hours. This is despite some
> > > > > > consumers idling due to

 
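The two consumer settings debated above can be summarized in a short sketch. It uses the property names as they appear in this thread (fetch.size and queuedchunks.max, from the old high-level consumer); the example values are illustrative assumptions, not recommendations from the thread.

import java.util.Properties;

// Sketch only: property names as used in this thread (old high-level consumer).
// fetch.size bounds how much data is pulled from one partition per fetch (one "chunk");
// queuedchunks.max bounds how many fetched chunks the consumer buffers in memory
// while waiting to be processed. Example values are assumptions for illustration.
public class ConsumerTuningSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("fetch.size", String.valueOf(300 * 1024)); // smaller chunks: the consumer moves between partitions more often
        props.put("queuedchunks.max", "10");                 // buffering more or fewer chunks does not change chunk size
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}

As Jay points out, queuedchunks.max only changes how many chunks are buffered; fetch.size is what determines how long the consumer stays on one partition's data before it gets the next chunk.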
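Jay's sizing rule, that the consumer's fetch size must be at least the maximum message size allowed on the server, amounts to a simple check. The limits below are made-up example numbers, and the property names (fetch.size, max.message.size) follow the thread's usage.

// Sketch of the constraint discussed above: fetch.size >= max.message.size.
// If fetch.size is smaller than the largest message the broker accepts, the
// consumer may be unable to pull such a message and can stall on that partition.
public class FetchSizeCheck {
    public static void main(String[] args) {
        int maxMessageSize = 1_000_000; // server-side max.message.size (example value)
        int fetchSize = 300 * 1024;     // consumer-side fetch.size (example value)

        if (fetchSize < maxMessageSize) {
            System.out.println("fetch.size " + fetchSize + " is below max.message.size "
                    + maxMessageSize + "; raise fetch.size to at least that limit.");
        } else {
            System.out.println("fetch.size can accommodate the largest allowed message.");
        }
    }
}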
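The interleaving behavior Jay describes (whole fetched chunks from each partition are handed to the consumer one at a time, rather than interleaving individual messages) can be pictured with the toy model below. This is not Kafka's actual implementation; the Chunk type, the queue capacity, and the process method are hypothetical stand-ins meant only to show why a large fetch.size keeps the consumer on one partition's data for a long stretch.

import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of chunk-level interleaving (hypothetical types, not Kafka's code).
class Chunk {
    final int partition;
    final List<String> messages; // up to roughly fetch.size worth of messages from one partition

    Chunk(int partition, List<String> messages) {
        this.partition = partition;
        this.messages = messages;
    }
}

class ChunkedConsumer {
    // Plays the role of queuedchunks.max: how many fetched chunks wait in memory.
    private final BlockingQueue<Chunk> chunks = new LinkedBlockingQueue<>(10);

    // A fetcher thread would add one whole chunk per partition fetch.
    void onFetchCompleted(Chunk chunk) throws InterruptedException {
        chunks.put(chunk);
    }

    // The consumer takes one chunk at a time and processes it fully before
    // taking the next, so a very large chunk delays data from other partitions.
    void consumeLoop() throws InterruptedException {
        while (true) {
            Chunk chunk = chunks.take();
            for (String message : chunk.messages) {
                process(chunk.partition, message);
            }
        }
    }

    private void process(int partition, String message) {
        // application logic goes here
    }
}

With a smaller fetch.size each chunk carries fewer messages, so the inner loop finishes sooner and the consumer rotates across partitions more evenly, which is the behavior Ian is after.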