Kafka >> mail # user >> Arguments for Kafka over RabbitMQ ?


Re: Arguments for Kafka over RabbitMQ ?
Nice of you to reply, Alexis, and clarify things! FWIW, I personally like
Rabbit very much and I am pushing to use it for other purposes at my
company. Its flexibility, ease of use and even documentation are really
top-notch compared with other options.

I might have explained some of my points a bit too rapidly and your
clarifications are much better than what I could do. I think the message
duplication is where I possibly made a false assumption but it wasn't a
decisive factor in our case. It might also have been a setup issue, not
sure. In any case, I don't think this is the place to discuss that.

On the other hand, as a user, I hope Kafka will not try to become a general
purpose messaging system, because its narrow focus is the reason I opted to
use it.

On Fri, Jun 7, 2013 at 8:54 AM, Alexis Richardson <[EMAIL PROTECTED]> wrote:

> Hi
>
> Alexis from Rabbit here.  I hope I am not intruding!
>
> It would be super helpful if people with questions, observations or
> moans posted them to the rabbitmq list too :-)
>
> A few comments:
>
> * Along with ZeroMQ, I consider Kafka to be one of the interesting and
> useful messaging projects out there.  In a world of cruft, Kafka is
> cool!
>
> * This is because both projects come at messaging from a specific
> point of view that is *different* from Rabbit.  OTOH, many other
> projects exist that replicate Rabbit features for fun, or NIH, or due
> to misunderstanding the semantics (yes, our docs could be better)
>
> * It is striking how few people describe those differences.  In a
> nutshell they are as follows:
>
> *** Kafka writes all incoming data to disk immediately, and then
> figures out who sees what.  So it is much more like a database than
> Rabbit, in that new consumers can appear well after the disk write and
> still subscribe to past messages.  Rabbit instead tries to
> deliver to consumers and buffers otherwise.  Persistence is optional
> but robust and a feature of the buffer ("queue") not the upstream
> machinery.  Rabbit is able to cache-on-arrival via a plugin, but this
> is a design overlay and not particularly optimal.
>
> *** Kafka is a client server system with end to end semantics.  It
> defines order to include processing order, and keeps state on the
> client to do this.  Group management is via a 3rd party service
> (Zookeeper? I forget which).  Rabbit is a server-only protocol based
> system which maintains order on the server and through completely
> language neutral protocol semantics.  This makes Rabbit perhaps more
> natural as a 'messaging service', e.g. for integration and other
> inter-app data transfer.
>
> *** Rabbit is a general purpose messaging system with extras like
> federation.  It speaks many protocols, and has core features like HA,
> transactions, management, etc.  Everything can be switched on or off.
> Getting all this to work while keeping the install light and fast is
> quite fiddly.  Kafka by contrast comes from a specific set of use
> cases, which are interesting certainly.  I am not sure if Kafka wants
> to be a general purpose messaging system, but it will become a bit
> more like Rabbit if that is the goal.
>
> *** Both approaches have costs.  In the case of Rabbit the cost is
> that more metadata is stored on the broker.  Kafka can get performance
> gains by storing less such data.  But we are talking about some N
> thousands of MPS versus some M thousands.  At those speeds the clients
> are usually the bottleneck anyway.
>
> * Let me also clarify some things:
>
> *** Rabbit does NOT store multiple copies of the same message across
> queues, unless they are very small (<60b, iirc).  A message delivered
> to >1 queue on 1 machine is stored once.  Metadata about that message
> may be stored more than once, but, at scale, the big cost is the
> payload.
>
> *** Rabbit's vanilla install does store some index data in memory when
> messages flow to disk.  You can change this by using a plugin, but
> this is a secret-menu undocumented feature.  Very very few people need
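
To make the first difference in the quoted list concrete (Kafka appends
everything to a log and lets consumers pick where to read from, while Rabbit
tries to deliver to consumers and buffers otherwise), here is a rough Python
sketch. The class names `LogTopic` and `BufferingQueue` and their methods are
made up for illustration; they are not the Kafka or RabbitMQ client APIs. The
sketch also assumes one consumer per queue and keeps everything in memory,
which neither system does.

```python
from collections import deque


class LogTopic:
    """Toy Kafka-like topic: every message is appended to a log and kept."""

    def __init__(self):
        self._log = []

    def publish(self, message):
        self._log.append(message)

    def read(self, offset):
        """Return (messages from `offset` onward, next offset to read from).

        The reader, not the topic, remembers the offset.
        """
        return self._log[offset:], len(self._log)


class BufferingQueue:
    """Toy Rabbit-like queue: deliver if a consumer is attached, else buffer.

    Simplifying assumption: at most one consumer at a time.
    """

    def __init__(self):
        self._buffer = deque()
        self._consumer = None

    def attach(self, callback):
        self._consumer = callback
        # Drain whatever was buffered while nobody was listening.
        while self._buffer:
            callback(self._buffer.popleft())

    def publish(self, message):
        if self._consumer is not None:
            self._consumer(message)   # delivered, then gone from the queue
        else:
            self._buffer.append(message)


# Log model: a reader that shows up late can still start from offset 0.
log = LogTopic()
log.publish("a")
log.publish("b")
early, early_next = log.read(0)
late, _ = log.read(0)

# Queue model: once delivered, messages are not there for a later consumer.
queue = BufferingQueue()
queue.publish("a")
queue.publish("b")
first = []
queue.attach(first.append)    # receives the buffered backlog
queue.publish("c")            # delivered straight to the attached consumer
second = []
queue.attach(second.append)   # attaches later, finds nothing left to replay
```

In the log model the topic keeps no per-consumer delivery state. Each reader
just remembers its own offset, which is why a late subscriber can still replay
from the start. In the queue model the state lives with the queue, and a
delivered message is no longer available to whoever attaches next.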

 