Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka >> mail # user >> Arguments for Kafka over RabbitMQ ?


Copy link to this message
-
Re: Arguments for Kafka over RabbitMQ ?
Nice of you to reply Alexis and clarify things! FWIW, I personally like
Rabbit very much and I am pushing to use it for other purposes at my
company. It's flexibility, ease of use and even documentation is really top
if you compare with other options.

I might have explained some of my points a bit too rapidly and your
clarifications are much better than what I could do. I think the message
duplication is where I possibly made a false assumption but it wasn't a
decisive factor in our case. It might have been a setup issue also, not
sure. In any way, I don't think here is the place to discuss that.

On the other hand, as a user, I hope Kafka will not try to become a general
purpose messaging system because that's the reason I opted to use it.
On Fri, Jun 7, 2013 at 8:54 AM, Alexis Richardson <[EMAIL PROTECTED]>wrote:

> Hi
>
> Alexis from Rabbit here.  I hope I am not intruding!
>
> It would be super helpful if people with questions, observations or
> moans posted them to the rabbitmq list too :-)
>
> A few comments:
>
> * Along with ZeroMQ, I consider Kafka to be one of the interesting and
> useful messaging projects out there.  In a world of cruft, Kafka is
> cool!
>
> * This is because both projects come at messaging from a specific
> point of view that is *different* from Rabbit.  OTOH, many other
> projects exist that replicate Rabbit features for fun, or NIH, or due
> to misunderstanding the semantics (yes, our docs could be better)
>
> * It is striking how few people describe those differences.  In a
> nutshell they are as follows:
>
> *** Kafka writes all incoming data to disk immediately, and then
> figures out who sees what.  So it is much more like a database than
> Rabbit, in that new consumers can appear well after the disk write and
> still subscribe to past messages.  Instead, Rabbit which tries to
> deliver to consumers and buffers otherwise.  Persistence is optional
> but robust and a feature of the buffer ("queue") not the upstream
> machinery.  Rabbit is able to cache-on-arrival via a plugin, but this
> is a design overlay and not particularly optimal.
>
> *** Kafka is a client server system with end to end semantics.  It
> defines order to include processing order, and keeps state on the
> client to do this.  Group management is via a 3rd party service
> (Zookeeper? I forget which).  Rabbit is a server-only protocol based
> system which maintains order on the server and through completely
> language neutral protocol semantics.  This makes Rabbit perhaps more
> natural as a 'messaging service' eg for integration and other
> inter-app data transfer.
>
> *** Rabbit is a general purpose messaging system with extras like
> federation.  It speaks many protocols, and has core features like HA,
> transactions, management, etc.  Everything can be switched on or off.
> Getting all this to work while keeping the install light and fast, is
> quite fiddly.  Kafka by contrast comes from a specific set of use
> cases, which are interesting certainly.  I am not sure if Kafka wants
> to be a general purpose messaging system, but it will become a bit
> more like Rabbit if that is the goal.
>
> *** Both approaches have costs.  In the case of Rabbit the cost is
> that more metadata is stored on the broker.  Kafka can get performance
> gains by storing less such data.  But we are talking about some N
> thousands of MPS versus some M thousands.  At those speeds the clients
> are usually the bottleneck anyway.
>
> * Let me also clarify some things:
>
> *** Rabbit does NOT store multiple copies of the same message across
> queues, unless they are very small (<60b, iirc).  A message delivered
> to >1 queue on 1 machine is stored once.  Metadata about that message
> may be stored more than once, but, at scale, the big cost is the
> payload.
>
> *** Rabbit's vanilla install does store some index data in memory when
> messages flow to disk.  You can change this by using a plugin, but
> this is a secret-menu undocumented feature.  Very very few people need