Kafka user mailing list: Anyone running kafka with a single broker in production? what about only 8GB ram?


Re: Anyone running kafka with a single broker in production? what about only 8GB ram?
I'm also curious to know what the limiting factor of Kafka write
throughput is.

I've never seen reports higher than 100MB/sec, though disks can obviously
provide much more. In my own test with a single broker, single partition,
single replica:

bin/kafka-producer-perf-test.sh --topics perf --threads 10 --broker-list
10.80.42.154:9092 --messages 5000000 --message-size 3000

It tops out around 90MB/sec. CPU, disk, memory, network, everything is
chilling, but still I can't get higher numbers.
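
For reference, one way to check whether the single partition (rather than the hardware) is the limit would be to repeat the exact same run against a topic with several partitions. The topic name below (perf8, assumed to be pre-created with 8 partitions) is only an illustration, not something from this thread:

bin/kafka-producer-perf-test.sh --topics perf8 --threads 10 --broker-list
10.80.42.154:9092 --messages 5000000 --message-size 3000

If the aggregate rate climbs well past 90MB/sec, that would suggest the per-partition append path, not disk, CPU or network, is the bottleneck.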
On Fri, Oct 11, 2013 at 11:17 AM, Bruno D. Rodrigues <
[EMAIL PROTECTED]> wrote:

> Producer:
>         props.put("batch.num.messages", "1000"); // default: 200
>         props.put("queue.buffering.max.messages", "20000"); // default: 10000
>         props.put("request.required.acks", "0");
>         props.put("producer.type", "async"); // default: sync
>
>         // return ++this.count % a_numPartitions; // just round-robin
>         props.put("partitioner.class", "main.SimplePartitioner"); // default: kafka.producer.DefaultPartitioner
>
>         // disabled = 70MB source, 70MB network; enabled = 70MB source, ~40-50MB network
>         props.put("compression.codec", "Snappy"); // default: none
>
> Consumer is with default settings, as I test separately without any
> consumer at all, and then test the extra load of having 1..n consumers. I
> assume the top speed would be without consumers at all. I'm measuring both
> the producer side and the consumer side.
>
> On the kafka server I've changed the following, expecting fewer disk writes
> at the cost of losing messages:
>
> #log.flush.interval.messages=10000
> log.flush.interval.messages=10000000
> #log.flush.interval.ms=1000
> log.flush.interval.ms=10000
> #log.segment.bytes=536870912
> # is signed int 32, only up to 2^31-1!
> log.segment.bytes=2000000000
> #log.retention.hours=168
> log.retention.hours=1
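
A rough back-of-the-envelope check, using the ~70-90MB/sec and 3000-byte figures quoted elsewhere in this thread purely as an illustration, shows which of those two flush triggers actually fires at this rate:

    70 MB/sec / 3000 bytes per message ≈ 23,000 messages/sec
    10,000,000 messages / 23,000 messages/sec ≈ 430 sec (~7 minutes) per count-based flush
    log.flush.interval.ms=10000 -> a time-based flush every 10 seconds

So at rates like these the 10-second timer dominates, and the message-count threshold essentially never triggers a flush on its own.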
>
>
> Basically I need high throughput of discardable messages. Having them
> persisted temporarily on disk, in the highly optimised manner Kafka
> shows, would be great not for reliability (not losing messages), but
> because it would let me fetch some previous messages even if the client
> (kafka client or real consumer client) disconnects, and it would provide a
> way to go back in time a few seconds if needed.
>
>
>
> On 11/10/2013, at 18:56, Magnus Edenhill <[EMAIL PROTECTED]> wrote:
>
> Make sure the fetch batch size and the local consumer queue sizes are large
> enough; setting them too low will leave your throughput bound by the
> broker<->client latency.
>
> This would be controlled using the following properties:
> - fetch.message.max.bytes
> - queued.max.message.chunks
>
> On the producer side you would want to play with:
> - queue.buffering.max.ms and .messages
> - batch.num.messages
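
A minimal sketch of where those consumer-side knobs go, using the 0.8 high-level consumer API. The zookeeper.connect address, group.id and the concrete values are illustrative assumptions, not figures from this thread:

import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.javaapi.consumer.ConsumerConnector;

public class ConsumerTuningSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // assumption: local ZooKeeper
        props.put("group.id", "perf-consumer");           // assumption: arbitrary group name
        // Bigger fetches and a deeper local chunk queue keep the consumer from being
        // bound by broker<->client round trips, as suggested above.
        props.put("fetch.message.max.bytes", "10485760"); // 10 MB per partition per fetch (example value)
        props.put("queued.max.message.chunks", "100");    // example value, well above the default

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        // ... create message streams and consume them here ...
        connector.shutdown();
    }
}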
>
> Memory on the broker should only affect disk cache performance: the more
> the merrier, of course, but it depends on your use case. With a bit of luck
> the disk caches are already hot for the data you are reading (e.g.,
> recently produced).
>
> Consuming millions of messages per second on a quad-core i7 with 8 GB of
> RAM is possible without breaking a sweat, given that the disk caches are hot.
>
>
> Regards,
> Magnus
>
>
> 2013/10/11 Bruno D. Rodrigues <[EMAIL PROTECTED]>
>
>
> On Thu, Oct 10, 2013 at 3:57 PM, Bruno D. Rodrigues <
> [EMAIL PROTECTED]> wrote:
>
> My personal newbie experience, which is surely completely wrong and
> misconfigured, got me up to 70MB/sec, either with controlled 1K messages
> (hence 70K msg/sec) or with more random data (test data from 100 bytes to
> a couple MB). First I thought the 70MB were the hard disk limit, but when
> I got the same result both with a proper linux server with a 10K disk and
> with a Mac mini with a 5400rpm disk, I got confused. The mini has 2GB, the
> linux server has 8 or 16, can't recall at the moment.
>
> The test was performed both with single and multiple producers and
> consumers. One producer = 70MB, two producers = 35MB each and so forth. Running

 