
Kishore V. Kopalle 2013-05-17, 07:26
Jun Rao 2013-05-17, 15:11
Re: Our use case and am I right in my definition of Throughput?
Another parameter to tune for higher throughput in 0.7 is the broker-side
flush interval. If you set it to something high, like 50K or 100K, you can
maximize the throughput achieved by your producer.
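
A rough cost model shows why a larger flush interval (or a larger producer batch) raises throughput: if every flush pays a fixed cost and every message a small marginal cost, the fixed cost is spread over more messages as the batch grows. The sketch below is illustrative only. The function name and both cost figures are hypothetical placeholders, not Kafka measurements.

```python
def modeled_throughput(batch_size, fixed_cost_s, per_message_cost_s):
    """Messages per second when each batch pays one fixed cost (for example a
    disk flush or a network round trip) plus a small cost per message."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch_time_s = fixed_cost_s + batch_size * per_message_cost_s
    return batch_size / batch_time_s


# Hypothetical costs, chosen only to show the shape of the curve.
FIXED_COST_S = 0.010            # assumed: 10 ms per flush
PER_MESSAGE_COST_S = 0.000002   # assumed: 2 microseconds per message

if __name__ == "__main__":
    for batch in (1, 500, 50_000, 100_000):
        rate = modeled_throughput(batch, FIXED_COST_S, PER_MESSAGE_COST_S)
        print(f"batch={batch:>7}  ~{rate:,.0f} msgs/sec")
```

In this model the rate approaches, but never exceeds, 1 / PER_MESSAGE_COST_S, so the gain from ever larger batches flattens out. The price is latency: messages wait longer before they are flushed and become visible to consumers.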

On Fri, May 17, 2013 at 8:11 AM, Jun Rao <[EMAIL PROTECTED]> wrote:

> Your earlier email seems to focus on latency, not throughput. Typically, you
> can either optimize for latency or throughput, but not both. If you want
> higher throughput, you should consider using the async mode in the producer
> with a larger batch size (e.g., 1000 messages). Using more instances of
> producer also increases the overall throughput.
> Thanks,
> Jun
> On Fri, May 17, 2013 at 12:26 AM, Kishore V. Kopalle <
> > Hello All,
> >
> > Our use case is to display certain aggregates on a GUI from a live stream
> > of data coming in at more than 100k messages/sec. My question is whether I
> > can use Kafka to handle at least 100k messages/sec and send them to Twitter
> > Storm for aggregate calculations. I already know that Storm can handle more
> > than 100k messages/sec.
> >
> > I have seen the slides on Kafka presented at ApacheCon 2011 North America,
> > and the volume handled is given on slide 12 as 150k events/sec. Can we take
> > this to mean that a throughput of 150k msgs/sec is possible with Kafka? I am
> > defining throughput not as the messages/sec sent at the producer or the
> > messages/sec received at the consumer, but as the number of messages per
> > second that can be sent over the wire between producer and consumer. With
> > that definition I am getting barely 1 message/sec, as per my earlier
> > mails. I am measuring it by timing how long a single message takes to
> > travel from producer to consumer. If the throughput, by my definition, were
> > 150k messages/sec, the time taken for a single message should have been
> > 1/(150k) seconds.
> >
> >
> > Regards,
> > Kishore
> >
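
The gap between the two figures in the question comes from pipelining: many messages are in transit at once, so throughput is the number of messages in flight divided by the per-message latency (Little's law), not simply 1/latency. A minimal sketch of that arithmetic follows. The function names are illustrative, and the one-second latency is an assumption standing in for the roughly 1 message/sec measurement described above.

```python
def throughput(in_flight_messages, latency_s):
    """Little's law: messages completed per second when `in_flight_messages`
    are in transit at any moment and each takes `latency_s` end to end."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return in_flight_messages / latency_s


def in_flight_needed(target_throughput, latency_s):
    """Messages that must be in transit at once to sustain the target rate."""
    return target_throughput * latency_s


ASSUMED_LATENCY_S = 1.0    # assumption: about one second per message
SLIDE_RATE = 150_000       # events/sec figure quoted from the slides

if __name__ == "__main__":
    # Timing one message at a time measures latency, not throughput.
    print("one at a time:", throughput(1, ASSUMED_LATENCY_S), "msgs/sec")
    # The same latency supports the slide figure if enough messages overlap.
    print("in flight for slide rate:",
          in_flight_needed(SLIDE_RATE, ASSUMED_LATENCY_S))
```

Under these assumptions, a one-second latency and 150k messages/sec are compatible as long as about 150k messages are in flight at any moment, which is what batching and multiple producers provide. A throughput test therefore sends a large number of messages back to back and divides the count by the elapsed time.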