Re: Our use case and am I right in my definition of Throughput?
Jun Rao 2013-05-17, 15:11
Your earlier email seems to focus on latency, not throughput. Typically, you
can optimize for either latency or throughput, but not both. If you want
higher throughput, you should consider using the async mode in the producer
with a larger batch size (e.g., 1000 messages). Running more producer
instances also increases the overall throughput.
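
To make the latency/throughput distinction concrete, here is a rough sketch in Python. All numbers and function names are made up for illustration (they are not Kafka measurements or Kafka APIs); the only assumption is that each producer sends one batch per round trip.

```python
# Illustrative numbers only; none of these are measurements from Kafka.

def throughput_msgs_per_sec(batch_size, batch_round_trip_sec, producers=1):
    """Messages/sec when each producer sends one batch per round trip."""
    return producers * batch_size / batch_round_trip_sec

def implied_latency_sec(throughput):
    """Per-message time IF messages were sent strictly one at a time."""
    return 1.0 / throughput

# One message at a time, about 1 s end to end: about 1 msg/sec.
sync_rate = throughput_msgs_per_sec(batch_size=1, batch_round_trip_sec=1.0)

# Hypothetical async producer: 1000 messages per batch, 10 ms per batch.
async_rate = throughput_msgs_per_sec(batch_size=1000, batch_round_trip_sec=0.010)

# Four such producers running in parallel.
parallel_rate = throughput_msgs_per_sec(1000, 0.010, producers=4)

print(sync_rate, async_rate, parallel_rate)

# 1/(150k) seconds would only be the per-message time with no batching
# and no messages in flight concurrently.
print(implied_latency_sec(150_000))
```

In the batched case every message still takes at least the 10 ms round trip, so timing a single message tells you the latency, not the throughput. Throughput is better measured by counting how many messages arrive over a fixed interval.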
On Fri, May 17, 2013 at 12:26 AM, Kishore V. Kopalle <
[EMAIL PROTECTED]> wrote:
> Hello All,
> Our use case is to display certain aggregates on a GUI from a live stream of
> data coming in at more than 100k messages/sec. My question is whether Kafka
> can handle at least 100k messages/sec and send it to Twitter Storm for
> aggregate calculations. I already know that Storm
> can handle more than 100k messages/sec.
> I have seen the slides presented at ApacheCon 2011 North America on Kafka, and
> the volume handled is given on slide 12 as 150k events/sec. Can we take
> this to mean that a throughput of 150k msgs/sec is possible with Kafka? I am
> defining throughput not as the messages/sec sent at the producer or the
> messages/sec received at the consumer, but as the number of messages per
> second that can be sent over the wire between producer and consumer. With
> such a definition I am getting barely 1 message/sec, as per my earlier
> mails. I am trying to measure it by timing how long it takes for a
> single message to go from producer to consumer. If the throughput,
> according to my definition, were 150k messages/sec, the time taken for a
> single message should have been 1/(150k) seconds.