Kafka >> mail # user >> Kafka 0.7 performance compared to bare metal


Re: Kafka 0.7 performance compared to bare metal
Your producer test uses a thread per core. Your consumer test uses a single
thread. A single thread is likely insufficient to get maximum throughput.
On Aug 30, 2013 8:46 AM, "Rafael Bagmanov" <[EMAIL PROTECTED]> wrote:

> Benjamin, do you mean a thread on the client side? I'm not quite getting
> what I'm limited by. Can you please explain a little bit more?
>
> A single-threaded producer is still capable of doing 50 MB/s on
> hi1.4xlarge, which is much slower than the 377 MB/s from a single FIO
> job, but still 5 times faster than what I'm getting from the consumer.
> Is that expected?
>
> Another mystery for me is that with a hot IO cache (the whole topic is
> in memory) I'm getting 50 MB/s to 100 MB/s (this huge std. dev. bugs
> me too) from a single-threaded consumer.
>
> And when the cache is cold, I don't see the Kafka broker making the
> best possible use of its SSD.
> I've tried setting fetch-size to 100 MB, but Kafka still reads from
> disk at only 10 MB/s (the disk by itself can satisfy many more read
> requests at the same latency and provide much higher throughput).
>
> To me it looks as if sendfile
> (http://man7.org/linux/man-pages/man2/sendfile.2.html) somehow works
> inefficiently with SSD, and I don't understand why or how this can be
> fixed.
>
> I do understand that you are advising me to use more partitions and more
> consumer threads. But I would like to know the limits I'm hitting in
> this single-threaded mode.
>
> Thanks!
>
> Rafael Bagmanov,
> Grid Dynamics
>
> 2013/8/30 Benjamin Black <[EMAIL PROTECTED]>:
> > You are maxing out the single consumer thread.
> > On Aug 30, 2013 1:35 AM, "Rafael Bagmanov" <[EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >>
> >> I am trying to understand how fast Kafka 0.7 is compared to what I can
> >> get from the hard drive. In essence I have 3 questions.
> >>
> >> In all tests below, I'm using a single broker with a single
> >> one-partition topic. The Kafka perf tests were run in 2 deployment
> >> configs:
> >> - broker, perf-test on same host
> >> - broker, perf-test on different hosts (the results are practically the
> >> same, so I won't post them here)
> >>
> >>
> >> I'm using FIO (http://freecode.com/projects/fio) to benchmark the speed
> >> of the hard drives.
> >>
> >> Hardware I'm using:
> >> 1) m1.xlarge with ephemeral storage, 4 core cpu, 16 GB ram
> >> 2) hi1.4xlarge  with SSD, 16 core cpu, 64 GB ram
> >> 3) desktop machine with 7200 rpm sata, 4 core cpu, 8 GB ram
> >>
> >> Kafka broker config:
> >> Oracle jdk 1.6.0_38,  -Xmx2048
> >>
> >> socket.send.buffer=16777216
> >> socket.receive.buffer=16777216
> >> max.socket.request.bytes=104857600
> >> log.flush.interval=10000
> >> log.default.flush.interval.ms=1000
> >> log.default.flush.scheduler.interval.ms=1000
> >> num.threads=[num of cores]
> >>
> >>
> >> For kafka-producer-perf-test I'm assuming that IO access pattern is
> >> sequential write.
> >>
> >> Here is the test I ran with FIO:
> >>
> >> [sequential-write]
> >> rw=write
> >> size=50G
> >> ioengine=sync
> >> numjobs=1
> >> directory=/tmp/fio
> >> filename=redo01.log
> >>
> >>
> >> Here is kafka performance test:
> >>
> >> ./bin/kafka-producer-perf-test.sh -topic "perf" --batch-size 3000
> >> --messages 50000000 --message-size 1300 --brokerinfo
> >> broker.list=0:host:9092 --threads [number-of-cores]
> >>
> >>
> >>
> >> ---------------------------------------------------
> >> |       | m1.xlarge   | hi1.4xlarge | desktop     |
> >> ---------------------------------------------------
> >> | kafka | 41 MB/s     | 217 MB/s    | 42 MB/s     |
> >> ---------------------------------------------------
> >> | fio   | 106 MB/s    | 377 MB/s    | 74 MB/s     |
> >> ---------------------------------------------------
> >>
> >>
> >> Question 1: The proportion (~1/2) is pretty stable against different

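The "~1/2" proportion mentioned in Question 1 can be checked directly against the table in the quoted message. The following is only a minimal sketch: the numbers are copied from that table, while the dictionary layout and the helper name kafka_to_fio_ratio are illustrative assumptions, not part of any Kafka or fio tooling.

```python
# Sequential-write throughput in MB/s, copied from the table in the
# quoted message. The structure of this dict is an assumption made
# for illustration only.
throughput = {
    "m1.xlarge": {"kafka": 41, "fio": 106},
    "hi1.4xlarge": {"kafka": 217, "fio": 377},
    "desktop": {"kafka": 42, "fio": 74},
}


def kafka_to_fio_ratio(figures):
    """Return Kafka producer throughput as a fraction of fio throughput."""
    return figures["kafka"] / figures["fio"]


# Hypothetical helper applied to each machine in the table.
ratios = {host: kafka_to_fio_ratio(f) for host, f in throughput.items()}

for host, ratio in ratios.items():
    print(f"{host}: kafka/fio = {ratio:.2f}")
```

On these figures the ratio comes out at about 0.39 for m1.xlarge and about 0.57 to 0.58 for hi1.4xlarge and the desktop machine, so "~1/2" holds only roughly.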
 