Re: Arguments for Kafka over RabbitMQ ?
Hi, Alex,

Thanks for sharing your thoughts here. Your understanding of Kafka is correct
and your analysis is very helpful. Just a couple of follow-up questions:

1. How do people typically scale out RabbitMQ? In Kafka, we have this
notion of a cluster that can include multiple brokers. Brokers in a cluster
are managed through Zookeeper.
2. How does RabbitMQ support HA? In the upcoming Kafka 0.8 release, we are
adding the capability of storing a message redundantly in multiple brokers
in a cluster for achieving both higher durability and availability. I am
wondering if RabbitMQ is doing something similar already.
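
To make the idea in point 2 concrete, here is a toy sketch of storing each
message on several brokers so that a read still succeeds after one broker
fails. It is only an illustration under stated assumptions: the names Broker
and ReplicatedLog are hypothetical, everything is in memory, and this is not
Kafka's actual replication protocol (there is no leader election and no
catch-up for a broker that comes back).

```python
class Broker:
    """Hypothetical in-memory stand-in for one broker holding a message log."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []        # messages in arrival order; list index = offset
        self.alive = True    # flipped to False to simulate a failure


class ReplicatedLog:
    """Toy model: every append is copied to each live replica broker."""

    def __init__(self, brokers, replication_factor):
        if replication_factor > len(brokers):
            raise ValueError("replication factor exceeds number of brokers")
        # Assumption: the first N brokers act as the replicas for this log.
        self.replicas = brokers[:replication_factor]

    def append(self, message):
        """Write to all live replicas; return how many copies were stored."""
        live = [b for b in self.replicas if b.alive]
        if not live:
            raise RuntimeError("no live replica available")
        for broker in live:
            broker.log.append(message)
        return len(live)

    def read(self, offset):
        """Serve the message from any live replica that has this offset."""
        for broker in self.replicas:
            if broker.alive and offset < len(broker.log):
                return broker.log[offset]
        raise LookupError("offset not available on any live replica")


brokers = [Broker(i) for i in range(3)]
log = ReplicatedLog(brokers, replication_factor=2)
copies = log.append("m0")       # stored on brokers 0 and 1
brokers[0].alive = False        # simulate losing one replica
recovered = log.read(0)         # still served, from broker 1
```

The redundancy buys durability (a second copy exists) and availability (reads
continue after a failure); a real system also has to bring a recovered broker
back in sync, which this sketch deliberately leaves out.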


On Fri, Jun 7, 2013 at 5:54 AM, Alexis Richardson <[EMAIL PROTECTED]> wrote:

> Hi
> Alexis from Rabbit here.  I hope I am not intruding!
> It would be super helpful if people with questions, observations or
> moans posted them to the rabbitmq list too :-)
> A few comments:
> * Along with ZeroMQ, I consider Kafka to be one of the interesting and
> useful messaging projects out there.  In a world of cruft, Kafka is
> cool!
> * This is because both projects come at messaging from a specific
> point of view that is *different* from Rabbit.  OTOH, many other
> projects exist that replicate Rabbit features for fun, or NIH, or due
> to misunderstanding the semantics (yes, our docs could be better)
> * It is striking how few people describe those differences.  In a
> nutshell they are as follows:
> *** Kafka writes all incoming data to disk immediately, and then
> figures out who sees what.  So it is much more like a database than
> Rabbit, in that new consumers can appear well after the disk write and
> still subscribe to past messages.  Rabbit, by contrast, tries to
> deliver to consumers and buffers otherwise.  Persistence is optional
> but robust and a feature of the buffer ("queue") not the upstream
> machinery.  Rabbit is able to cache-on-arrival via a plugin, but this
> is a design overlay and not particularly optimal.
> *** Kafka is a client server system with end to end semantics.  It
> defines order to include processing order, and keeps state on the
> client to do this.  Group management is via a 3rd party service
> (Zookeeper? I forget which).  Rabbit is a server-only protocol based
> system which maintains order on the server and through completely
> language neutral protocol semantics.  This makes Rabbit perhaps more
> natural as a 'messaging service', e.g. for integration and other
> inter-app data transfer.
> *** Rabbit is a general purpose messaging system with extras like
> federation.  It speaks many protocols, and has core features like HA,
> transactions, management, etc.  Everything can be switched on or off.
> Getting all this to work while keeping the install light and fast is
> quite fiddly.  Kafka by contrast comes from a specific set of use
> cases, which are interesting certainly.  I am not sure if Kafka wants
> to be a general purpose messaging system, but it will become a bit
> more like Rabbit if that is the goal.
> *** Both approaches have costs.  In the case of Rabbit the cost is
> that more metadata is stored on the broker.  Kafka can get performance
> gains by storing less such data.  But we are talking about some N
> thousands of MPS versus some M thousands.  At those speeds the clients
> are usually the bottleneck anyway.
> * Let me also clarify some things:
> *** Rabbit does NOT store multiple copies of the same message across
> queues, unless they are very small (<60b, iirc).  A message delivered
> to >1 queue on 1 machine is stored once.  Metadata about that message
> may be stored more than once, but, at scale, the big cost is the
> payload.
> *** Rabbit's vanilla install does store some index data in memory when
> messages flow to disk.  You can change this by using a plugin, but
> this is a secret-menu undocumented feature.  Very very few people need
> any such thing.
> *** A Rabbit queue is lightweight.  It's just an ordered consumption
> buffer that can persist and ack.  Don't assume things about Rabbit