It seems there are two underlying things here: storing messages to
stable storage, and making messages available to consumers (i.e.,
storing messages on the broker). One can be achieved simply and reliably
by spooling to local disk, the other requires network and is inherently
less reliable. Buffering messages in memory does not help with the first
one since they are in volatile storage, but it does help with the second
one in the event of a network partition.

I could imagine a producer running in "ultra-reliability" mode where it
uses a local log file as a buffer to which all messages are written and
from which they are read. One issue with this, though, is that now you
have to worry about the performance and capacity of the disks on your
producers (which can be numerous compared to brokers). As for
performance, the data being written by producers is already in active
memory, so writing it to disk and then doing a zero-copy transfer to the
network should be pretty fast (maybe?).

Or, Kafka can remain more "protocol-ish" and less "application-y" and
just give you errors when brokers are unavailable and let your
application deal with it. This is basically what TCP/HTTP/etc. do. HTTP
servers don't say "hold on, there's a problem, let me try that request
again in a second."
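In that "protocol-ish" mode the contract is just: send fails fast, and
the application picks its own policy (drop, retry, spool, whatever). A
minimal sketch of that division of labor, with hypothetical names not
taken from any real Kafka client:

```python
class BrokerUnavailable(Exception):
    """Raised instead of buffering when the broker can't be reached."""

class ProtocolishProducer:
    # Hypothetical sketch: no internal retries or buffering; failures
    # are surfaced immediately, the way TCP/HTTP report errors.
    def __init__(self, broker_up=True):
        self.broker_up = broker_up
        self.delivered = []  # stands in for the remote broker log

    def send(self, message):
        if not self.broker_up:
            raise BrokerUnavailable("broker unreachable")
        self.delivered.append(message)

# The application, not the client, decides what failure means:
def app_send(producer, message, spill):
    try:
        producer.send(message)
    except BrokerUnavailable:
        spill.append(message)  # e.g. drop, spool to disk, alert, ...
```

The point is that the error-handling policy lives entirely in
`app_send`, so different applications can make different trade-offs.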

Interesting discussion, btw :)

On 4/15/13 2:18 PM, Piotr Kozikowski wrote: