Kafka >> mail # user >> Arguments for Kafka over RabbitMQ ?


Re: Arguments for Kafka over RabbitMQ ?
Hi, Alex,

Thanks for sharing your thoughts here. Your understanding of Kafka is correct
and your analysis is very helpful. Just a couple of follow-up questions on
RabbitMQ.

1. How do people typically scale out RabbitMQ? In Kafka, we have this
notion of a cluster that can include multiple brokers. Brokers in a cluster
are managed through Zookeeper.
2. How does RabbitMQ support HA? In the upcoming Kafka 0.8 release, we are
adding the capability of storing a message redundantly in multiple brokers
in a cluster for achieving both higher durability and availability. I am
wondering if RabbitMQ is doing something similar already.

Jun

On Fri, Jun 7, 2013 at 5:54 AM, Alexis Richardson <[EMAIL PROTECTED]> wrote:

> Hi
>
> Alexis from Rabbit here.  I hope I am not intruding!
>
> It would be super helpful if people with questions, observations or
> moans posted them to the rabbitmq list too :-)
>
> A few comments:
>
> * Along with ZeroMQ, I consider Kafka to be one of the interesting and
> useful messaging projects out there.  In a world of cruft, Kafka is
> cool!
>
> * This is because both projects come at messaging from a specific
> point of view that is *different* from Rabbit.  OTOH, many other
> projects exist that replicate Rabbit features for fun, or NIH, or due
> to misunderstanding the semantics (yes, our docs could be better)
>
> * It is striking how few people describe those differences.  In a
> nutshell they are as follows:
>
> *** Kafka writes all incoming data to disk immediately, and then
> figures out who sees what.  So it is much more like a database than
> Rabbit, in that new consumers can appear well after the disk write and
> still subscribe to past messages.  Rabbit, by contrast, tries to
> deliver to consumers and buffers otherwise.  Persistence is optional
> but robust and a feature of the buffer ("queue") not the upstream
> machinery.  Rabbit is able to cache-on-arrival via a plugin, but this
> is a design overlay and not particularly optimal.
>
> *** Kafka is a client server system with end to end semantics.  It
> defines order to include processing order, and keeps state on the
> client to do this.  Group management is via a 3rd party service
> (Zookeeper? I forget which).  Rabbit is a server-only protocol based
> system which maintains order on the server and through completely
> language neutral protocol semantics.  This makes Rabbit perhaps more
> natural as a 'messaging service', e.g. for integration and other
> inter-app data transfer.
>
> *** Rabbit is a general purpose messaging system with extras like
> federation.  It speaks many protocols, and has core features like HA,
> transactions, management, etc.  Everything can be switched on or off.
> Getting all this to work while keeping the install light and fast is
> quite fiddly.  Kafka by contrast comes from a specific set of use
> cases, which are interesting certainly.  I am not sure if Kafka wants
> to be a general purpose messaging system, but it will become a bit
> more like Rabbit if that is the goal.
>
> *** Both approaches have costs.  In the case of Rabbit the cost is
> that more metadata is stored on the broker.  Kafka can get performance
> gains by storing less such data.  But we are talking about some N
> thousands of MPS versus some M thousands.  At those speeds the clients
> are usually the bottleneck anyway.
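
[Editorial illustration, not part of the original message.] The first difference listed above (Kafka writes to disk first and lets consumers read from any past position, while Rabbit tries to deliver and buffers only what is undelivered) can be sketched as a toy model in Python. The class names `LogTopic` and `BufferQueue` are hypothetical and belong to neither Kafka's nor RabbitMQ's API; real brokers add persistence, acknowledgements, partitioning, and much more, so treat this only as a sketch of the two consumption models.

```python
from collections import deque


class LogTopic:
    """Log-style topic (hypothetical): messages are appended and kept;
    each reader tracks its own offset."""

    def __init__(self):
        self._log = []

    def publish(self, message):
        self._log.append(message)
        return len(self._log) - 1  # offset assigned to the new message

    def read(self, offset, max_messages=10):
        """Return messages starting at `offset`; the caller owns its position."""
        return self._log[offset:offset + max_messages]


class BufferQueue:
    """Queue-style buffer (hypothetical): messages are held only until
    a consumer takes them."""

    def __init__(self):
        self._pending = deque()

    def publish(self, message):
        self._pending.append(message)

    def consume(self):
        """Hand the oldest pending message to a consumer and forget it."""
        return self._pending.popleft() if self._pending else None


log = LogTopic()
queue = BufferQueue()
for m in ["a", "b", "c"]:
    log.publish(m)
    queue.publish(m)

# An early consumer reads everything from both.
early_from_log = log.read(0)
early_from_queue = [queue.consume() for _ in range(3)]

# A consumer that appears later can still replay the log from offset 0,
# but finds nothing left in the queue.
late_from_log = log.read(0)
late_from_queue = queue.consume()
```

In this model the server-side state for the log is just the list itself; the read position lives with the client, which is the "keeps state on the client" point made above. The queue, by contrast, has to track what is still pending on the server.
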
>
> * Let me also clarify some things:
>
> *** Rabbit does NOT store multiple copies of the same message across
> queues, unless they are very small (<60b, iirc).  A message delivered
> to >1 queue on 1 machine is stored once.  Metadata about that message
> may be stored more than once, but, at scale, the big cost is the
> payload.
>
> *** Rabbit's vanilla install does store some index data in memory when
> messages flow to disk.  You can change this by using a plugin, but
> this is a secret-menu undocumented feature.  Very very few people need
> any such thing.
>
> *** A Rabbit queue is lightweight.  It's just an ordered consumption
> buffer that can persist and ack.  Don't assume things about Rabbit

 