We just went through a similar exercise with RabbitMQ at our company,
streaming activity data from our various web properties. Our use case
requires consumption of this stream by many heterogeneous consumers
including batch (Hadoop) and real-time (Storm). We pointed out that Kafka
acts as a configurable rolling window of time over the activity stream. The
window defaults to 7 days, which allows clients with different latencies,
such as Hadoop and Storm, to read from the same stream.
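For reference, the retention window is a broker-side setting in
server.properties; a minimal sketch, assuming the standard Kafka
retention keys (values shown are the defaults we relied on, adjust to
taste):

```properties
# Keep each partition's log for 7 days (the default), after which
# old segments are deleted regardless of whether anyone has read them.
log.retention.hours=168

# Optionally cap retention by size instead of (or in addition to) time.
# log.retention.bytes=1073741824
```

Any consumer that can catch up within that window, whether a nightly Hadoop
job or a Storm topology reading in near real time, sees the same stream.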
We pointed out that the Kafka brokers don't need to maintain consumer state
for the stream and only have to keep one copy of the stream to support N
consumers. RabbitMQ brokers, on the other hand, have to maintain the state
of each consumer as well as create a copy of the stream for each consumer.
In our scenario we have 10-20 consumers, and at the scale and throughput of
our activity stream we were able to show that RabbitMQ quickly becomes the
bottleneck under load.
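To make that difference concrete, here is a toy model of the two storage
strategies (illustrative only, not the real Kafka or RabbitMQ APIs): a
Kafka-style broker keeps one shared log and each consumer tracks its own
offset, while a RabbitMQ-style fanout copies every message into a
per-consumer queue.

```python
class SharedLog:
    """Kafka-style: one copy of the stream; consumers keep their own offsets."""
    def __init__(self):
        self.log = []  # single copy, regardless of consumer count

    def append(self, msg):
        self.log.append(msg)

    def read(self, offset):
        # The consumer passes its own offset; the broker stores no per-consumer state.
        return self.log[offset:]


class FanoutQueues:
    """RabbitMQ-style fanout: one queue (one copy of the stream) per consumer."""
    def __init__(self, n_consumers):
        self.queues = [[] for _ in range(n_consumers)]

    def append(self, msg):
        for q in self.queues:  # N writes and N copies per message
            q.append(msg)


N = 20  # consumers, matching the 10-20 range above
log = SharedLog()
fanout = FanoutQueues(N)
for i in range(1000):
    log.append(i)
    fanout.append(i)

stored_kafka_style = len(log.log)                         # 1000 messages total
stored_rabbit_style = sum(len(q) for q in fanout.queues)  # 20000 messages total
```

With 1,000 messages and 20 consumers, the fanout design stores and writes
20x the data, which is exactly where we saw RabbitMQ fall over as
throughput grew.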
On Thu, Jun 6, 2013 at 12:40 PM, Dragos Manolescu <
[EMAIL PROTECTED]> wrote: