Kafka, mail # user - Analysis of producer performance


Re: Analysis of producer performance
Piotr Kozikowski 2013-04-09, 17:23
Jun,

Thank you for your comments. I'll reply point by point for clarity.

1. We were aware of the migration tool, but since we haven't used Kafka in
production yet, we simply started with the 0.8 version directly.

2. I hadn't seen those particular slides; they are very interesting. I'm not
sure we're testing the same thing, though. In our case we vary the number of
physical machines, but each one has 10 threads accessing a pool of Kafka
producer objects, and in theory a single machine is enough to saturate the
brokers (which our test mostly confirms). Also, assuming that the slides
are based on the built-in producer performance tool, I know that we started
getting very different numbers once we switched to "real" (actual
production log) messages. Compression may also be a factor if it
wasn't configured the same way in those tests.
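
To make the setup in point 2 concrete, here is a minimal sketch of several threads sharing a pool of producer objects. It is not the actual test harness: `FakeProducer`, `run_load`, the topic name "logs" and the pool size of 4 are all hypothetical, and the stand-in producer only counts messages instead of talking to a broker.

```python
import queue
import threading


class FakeProducer:
    """Hypothetical stand-in for a Kafka producer; it only counts messages."""

    def __init__(self):
        self.sent = 0

    def send(self, topic, message):
        # A real producer would batch, compress and write to a broker here.
        self.sent += 1


def run_load(num_threads=10, pool_size=4, messages_per_thread=1000):
    """Send messages from num_threads workers through a shared producer pool."""
    producers = [FakeProducer() for _ in range(pool_size)]

    # The pool is a thread-safe queue: a worker takes a producer out,
    # uses it exclusively for one send, then puts it back.
    pool = queue.Queue()
    for producer in producers:
        pool.put(producer)

    def worker():
        for i in range(messages_per_thread):
            producer = pool.get()  # blocks while every producer is in use
            try:
                producer.send("logs", f"message {i}")
            finally:
                pool.put(producer)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Total number of messages handed to the producers.
    return sum(p.sent for p in producers)
```

Calling `run_load()` with the defaults pushes 10 x 1000 messages through 4 shared producers. With fewer producers than threads, workers wait on the pool, which is one way a single machine's send rate can be capped before the brokers are.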

3. In the latency section, there are two tests, one for average and another
for maximum latency. Each one has two graphs presenting the exact same data
but at different levels of zoom. The first one is to observe small
variations of latency when target throughput <= actual throughput. The
second is to observe the overall shape of the graph once latency starts
growing when target throughput > actual throughput. I hope that makes sense.
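
The two regimes in point 3 can be illustrated with a toy model. This is only a sketch under simplifying assumptions (one FIFO server, perfectly regular arrivals, a fixed service time); `simulate_latencies` and its parameters are hypothetical and say nothing about how Kafka itself behaves.

```python
def simulate_latencies(target_rate, capacity, num_messages):
    """Return per-message latency in seconds for a single FIFO server.

    target_rate: messages per second the sender tries to push
    capacity:    messages per second the server can actually handle
    """
    interval = 1.0 / target_rate  # gap between consecutive arrivals
    service = 1.0 / capacity      # time needed to handle one message
    latencies = []
    server_free_at = 0.0
    for i in range(num_messages):
        arrival = i * interval
        # A message starts when it has arrived and the server is idle.
        start = max(arrival, server_free_at)
        server_free_at = start + service
        latencies.append(server_free_at - arrival)
    return latencies
```

In this model, `simulate_latencies(500, 1000, 1000)` keeps every latency at the service time of about 1 ms, because the server is always idle when a message arrives. With `simulate_latencies(2000, 1000, 1000)` the backlog grows with every message and the last one waits about 0.5 s. That is why a zoomed-in graph is useful for the first regime and a zoomed-out graph for the second.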

4. That sounds great, looking forward to it.

Piotr

On Mon, Apr 8, 2013 at 9:48 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

> Piotr,
>
> Thanks for sharing this. Very interesting and useful study. A few comments:
>
> 1. For existing 0.7 users, we have a migration tool that mirrors data from
> an 0.7 cluster to an 0.8 cluster. Applications can upgrade to 0.8 by
> upgrading consumers first, followed by producers.
>
> 2. Have you looked at the Kafka ApacheCon slides (
> http://www.slideshare.net/junrao/kafka-replication-apachecon2013)? Towards
> the end, there are some performance numbers too. The figure for throughput
> vs. number of producers is different from what you have. Not sure if this
> is because you have turned on compression.
>
> 3. Not sure that I understand the difference between the first 2 graphs in
> the latency section. What's different between the 2 tests?
>
> 4. Post 0.8, we plan to improve the producer-side throughput by
> implementing a non-blocking socket on the client side.
>
> Jun
>
>
> On Mon, Apr 8, 2013 at 4:42 PM, Piotr Kozikowski <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > At LiveRamp we are considering replacing Scribe with Kafka, and as a first
> > step we ran some tests to evaluate producer performance. You can find our
> > preliminary results here:
> > https://blog.liveramp.com/2013/04/08/kafka-0-8-producer-performance-2/.
> > We hope this will be useful for some folks, and if anyone has comments or
> > suggestions about what to do differently to obtain better results, your
> > feedback will be very welcome.
> >
> > Thanks,
> >
> > Piotr
> >
>

 