Kafka >> mail # user >> Anyone running kafka with a single broker in production? what about only 8GB ram?


Re: Anyone running kafka with a single broker in production? what about only 8GB ram?
I'm also curious to know what the limiting factor of Kafka write
throughput is.

I've never seen reports higher than 100 MB/sec, though obviously disks can
provide much more. In my own test with a single broker, single partition, and
single replica:
bin/kafka-producer-perf-test.sh --topics perf --threads 10 --broker-list 10.80.42.154:9092 --messages 5000000 --message-size 3000
it tops out at around 90 MB/sec. CPU, disk, memory, network, everything is
chilling, but I still can't get higher numbers.
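
For scale, here is a rough back-of-the-envelope sketch of what those perf-test numbers imply. It assumes 1 MB = 1,000,000 bytes and ignores protocol overhead; the class and method names are made up for illustration and are not part of Kafka.

```java
public class PerfTestMath {
    // Total payload bytes sent by the run: message count times message size.
    public static long totalBytes(long messages, long messageSizeBytes) {
        return messages * messageSizeBytes;
    }

    // Message rate implied by a byte throughput (assumes 1 MB = 1,000,000 bytes).
    public static double messagesPerSecond(double mbPerSec, long messageSizeBytes) {
        return mbPerSec * 1_000_000.0 / messageSizeBytes;
    }

    // Seconds needed to push the given number of bytes at the given throughput.
    public static double secondsToSend(long bytes, double mbPerSec) {
        return bytes / (mbPerSec * 1_000_000.0);
    }

    public static void main(String[] args) {
        long total = totalBytes(5_000_000L, 3_000L);
        System.out.println("total payload bytes: " + total);
        System.out.println("msgs/sec at 90 MB/sec: " + messagesPerSecond(90.0, 3_000L));
        System.out.println("seconds for the whole run: " + secondsToSend(total, 90.0));
    }
}
```

So 90 MB/sec with 3000-byte messages is about 30,000 messages per second, and the full 5,000,000-message run moves 15 GB of payload.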
On Fri, Oct 11, 2013 at 11:17 AM, Bruno D. Rodrigues <
[EMAIL PROTECTED]> wrote:

> Producer:
>         props.put("batch.num.messages", "1000"); // 200
>         props.put("queue.buffering.max.messages", "20000"); // 10000
>         props.put("request.required.acks", "0");
>         props.put("producer.type", "async"); // sync
>
>         // return ++this.count % a_numPartitions; // just round-robin
>         props.put("partitioner.class", "main.SimplePartitioner"); // kafka.producer.DefaultPartitioner
>
>         // disabled = 70MB source, 70MB network, enabled = 70MB source, ~40-50MB network
>         props.put("compression.codec", "Snappy"); // none
>
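
The round-robin idea in the commented-out line above (`++this.count % a_numPartitions`) can be sketched as a standalone class. This is only an illustration, not the actual main.SimplePartitioner, and it deliberately does not implement Kafka's partitioner interface so that it compiles without the Kafka jar; the class name is hypothetical. It also assumes the producer may call it from several threads, which is why it uses an atomic counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinSketch {
    // Shared counter; atomic so concurrent callers never see the same value.
    private final AtomicInteger count = new AtomicInteger();

    // Same idea as "++count % numPartitions". floorMod keeps the result in
    // [0, numPartitions) even after the int counter overflows to negative.
    public int partition(Object key, int numPartitions) {
        return Math.floorMod(count.incrementAndGet(), numPartitions);
    }

    public static void main(String[] args) {
        RoundRobinSketch p = new RoundRobinSketch();
        for (int i = 0; i < 6; i++) {
            System.out.println(p.partition(null, 3));
        }
    }
}
```

The key is ignored on purpose: every message simply goes to the next partition in turn.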
> Consumer is with default settings, as I test separately without any
> consumer at all, and then test the extra load of having 1..n consumers. I
> assume the top speed would be without consumers at all. I'm measuring both
> the produced messages as well as the consumer side.
>
> On the Kafka server I've changed the following, expecting fewer disk writes
> at the cost of losing messages:
>
> #log.flush.interval.messages=10000
> log.flush.interval.messages=10000000
> #log.flush.interval.ms=1000
> log.flush.interval.ms=10000
> #log.segment.bytes=536870912
> # is signed int 32, only up to 2^31-1!
> log.segment.bytes=2000000000
> #log.retention.hours=168
> log.retention.hours=1
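
The "signed int 32" comment above is easy to check with a few lines. A minimal sketch, assuming only what that comment says (that log.segment.bytes has to fit in a signed 32-bit integer); the class and method names are made up for illustration.

```java
public class SegmentSizeCheck {
    // True if the value can be stored in a signed 32-bit int.
    public static boolean fitsInInt32(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("2^31-1 = " + Integer.MAX_VALUE);
        System.out.println("536870912 fits:  " + fitsInInt32(536_870_912L));
        System.out.println("2000000000 fits: " + fitsInInt32(2_000_000_000L));
        System.out.println("2^31 fits:       " + fitsInInt32(1L << 31));
    }
}
```

So 2000000000 is still within the limit, but not by much: the ceiling is 2147483647.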
>
>
> Basically I need high throughput of discardable messages, so having them
> persisted temporarily on disk, in a highly optimised manner like Kafka
> does, would be great, not for reliability (not losing messages), but
> because it would let me get some previous messages even if the client
> (Kafka client or real consumer client) disconnects, as well as providing a
> way to go back in time a few seconds if needed.
>
>
>
> On 11/10/2013, at 18:56, Magnus Edenhill <[EMAIL PROTECTED]> wrote:
>
> Make sure the fetch batch size and the local consumer queue sizes are large
> enough; setting them too low will limit your throughput to the
> broker<->client latency.
>
> This would be controlled using the following properties:
> - fetch.message.max.bytes
> - queued.max.message.chunks
>
> On the producer side you would want to play with:
> - queue.buffering.max.ms and .messages
> - batch.num.messages
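
To see why a small fetch size ties throughput to latency, here is a deliberately simplified model. It assumes one fetch request in flight at a time, each returning at most a fixed number of bytes per round trip; real clients may pipeline or fetch several partitions at once, so treat it as an upper-bound intuition only. The sizes and the 2 ms round trip are made-up numbers, and the class is hypothetical.

```java
public class FetchLatencyBound {
    // Upper bound in bytes/sec when one fetch of at most fetchBytes
    // completes per round trip of rttMillis milliseconds.
    public static double maxBytesPerSecond(long fetchBytes, double rttMillis) {
        return fetchBytes * 1000.0 / rttMillis;
    }

    public static void main(String[] args) {
        double rttMillis = 2.0; // assumed broker<->client round trip
        System.out.println("64 KiB per fetch: "
                + maxBytesPerSecond(64L * 1024, rttMillis) + " bytes/sec");
        System.out.println("1 MiB per fetch:  "
                + maxBytesPerSecond(1024L * 1024, rttMillis) + " bytes/sec");
    }
}
```

Under this model the bound scales linearly with the fetch size, which is the reason for raising fetch.message.max.bytes when the round trip is the bottleneck.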
>
> Memory on the broker should only affect disk cache performance. The more
> the merrier, of course, but it depends on your use case; with a bit of luck
> the disk caches are already hot for the data you are reading (e.g.,
> recently produced).
>
> Consuming millions of messages per second on a quad-core i7 with 8 GB of
> RAM is possible without breaking a sweat, given the disk caches are hot.
>
>
> Regards,
> Magnus
>
>
> 2013/10/11 Bruno D. Rodrigues <[EMAIL PROTECTED]>
>
>
> On Thu, Oct 10, 2013 at 3:57 PM, Bruno D. Rodrigues <
> [EMAIL PROTECTED]> wrote:
>
> My personal newbie experience, which is surely completely wrong and
> misconfigured, got me up to 70MB/sec, both with controlled 1K messages
> (hence 70K msg/sec) and with more random data (test data from 100
> bytes to a couple of MB). At first I thought the 70MB was the hard disk
> limit, but when I got the same result both with a proper Linux server with
> a 10K disk and with a Mac mini with a 5400rpm disk, I got confused.
>
> The mini has 2G, the Linux server has 8 or 16, can't recall at the moment.
>
> The test was performed both with single and multiple producers and
> consumers.
>
> One producer = 70MB, two producers = 35MB each and so forth. Running