Re: Arguments for Kafka over RabbitMQ ?
We just went through a similar exercise at our company, evaluating Kafka
against RabbitMQ for streaming activity data from our various web
properties.  Our use case requires consumption of this stream by many
heterogeneous consumers, including batch (Hadoop) and real-time (Storm).
We pointed out that Kafka acts as a configurable rolling window of time on
the activity stream.  The default window is 7 days, which lets clients with
very different latencies, such as Hadoop and Storm, read from the same
stream.
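
To make the rolling-window idea concrete, here is a minimal Python sketch.
It is a toy model, not Kafka's implementation: the RollingLog class, its
method names, and the one-event-per-hour data are all made up for
illustration.  It only shows that a slow batch reader and a fast real-time
reader can both be served from the same retained log.

```python
from collections import deque

HOUR = 3600
# Assumption: mirrors the 7-day default window mentioned above.
RETENTION_SECONDS = 7 * 24 * HOUR


class RollingLog:
    """Hypothetical append-only log that drops entries older than a window."""

    def __init__(self, retention_seconds=RETENTION_SECONDS):
        self.retention_seconds = retention_seconds
        self.entries = deque()  # (offset, timestamp, payload), oldest first
        self.next_offset = 0

    def append(self, timestamp, payload):
        self.entries.append((self.next_offset, timestamp, payload))
        self.next_offset += 1

    def expire(self, now):
        # Drop everything that has fallen out of the retention window.
        cutoff = now - self.retention_seconds
        while self.entries and self.entries[0][1] < cutoff:
            self.entries.popleft()

    def read_from(self, offset):
        # Readers pass in their own position; the log keeps no reader state.
        return [(o, p) for (o, _t, p) in self.entries if o >= offset]


# Ten days of activity, one event per hour, then expire old data.
log = RollingLog()
for hour in range(240):
    log.append(hour * HOUR, f"event-{hour}")
log.expire(now=240 * HOUR)

# A daily batch job catches up on the last 24 hours in one read,
# while a real-time reader only asks for the newest event.
batch_events = log.read_from(216)
realtime_events = log.read_from(239)
```

Both readers use the same log; the only difference is the offset each one
asks for.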

We also pointed out that Kafka brokers don't need to maintain consumer
state for the stream and only have to keep one copy of the stream to
support N consumers.  Rabbit brokers, on the other hand, have to maintain
the state of each consumer as well as create a copy of the stream for each
consumer.  In our scenario we have 10-20 consumers, and at the scale and
throughput of our activity stream we were able to show that Rabbit quickly
becomes the bottleneck under load.
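
The storage side of that argument can be sketched in a few lines of Python.
This is a deliberately simplified model of the reasoning above, not a
description of either broker's actual storage engine; SharedLog and
FanoutQueues are hypothetical names, and the 15 consumers and 1000 messages
are arbitrary numbers picked from within the range we described.

```python
class SharedLog:
    """Log-style model: one copy of the stream, readers track their offsets."""

    def __init__(self):
        self.messages = []

    def publish(self, msg):
        self.messages.append(msg)

    def fetch(self, offset, max_messages=100):
        # The broker holds no per-consumer state; the caller supplies the offset.
        return self.messages[offset:offset + max_messages]

    def stored_messages(self):
        return len(self.messages)


class FanoutQueues:
    """Queue-style model: every consumer has its own queue of pending messages."""

    def __init__(self, consumer_names):
        self.queues = {name: [] for name in consumer_names}

    def publish(self, msg):
        # Each message is enqueued once per consumer.
        for queue in self.queues.values():
            queue.append(msg)

    def stored_messages(self):
        return sum(len(queue) for queue in self.queues.values())


consumers = [f"consumer-{i}" for i in range(15)]
shared = SharedLog()
fanout = FanoutQueues(consumers)

for i in range(1000):
    shared.publish(f"event-{i}")
    fanout.publish(f"event-{i}")

# Offsets live on the consumer side in the log-style model.
offsets = {name: 0 for name in consumers}
first_batch = shared.fetch(offsets["consumer-0"])
offsets["consumer-0"] += len(first_batch)
```

In this model the shared log holds 1000 entries no matter how many
consumers read it, while the fan-out model holds 15 x 1000 pending entries
until consumers drain their queues.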

On Thu, Jun 6, 2013 at 12:40 PM, Dragos Manolescu <
[EMAIL PROTECTED]> wrote:

> Hi --
>
> I am preparing to make a case for using Kafka instead of Rabbit MQ as a
> broker-based messaging provider. The context is similar to that of the
> Kafka papers and user stories: the producers publish monitoring data and
> logs, and a suite of subscribers consume this data (some store it, others
> perform computations on the event stream). The requirements are typical of
> this context: low-latency, high-throughput, ability to deal with bursts and
> operate in/across multiple data centers, etc.
>
> I am familiar with the performance comparison between Kafka, Rabbit MQ and
> Active MQ from the NetDB 2011 paper<
> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf>.
> However in the two years that passed since then the number of production
> Kafka installations increased, and people are using it in different ways
> than those imagined by Kafka's designers. In light of these experiences one
> can use more data points and color when contrasting to Rabbit MQ (which by
> the way also evolved since 2011). (And FWIW I know I am not the first one
> to walk this path; see for example last year's OSCON session on the State
> of MQ<http://lanyrd.com/2012/oscon/swrcz/>.)
>
> I would appreciate it if you could share measurements, results, or even
> anecdotal evidence along these lines. How have you avoided the "let's use
> Rabbit MQ because everybody else does it" route when solving problems for
> which Kafka is a better fit?
>
> Thanks,
>
> -Dragos
>