Kafka >> mail # user >> Arguments for Kafka over RabbitMQ ?


Re: Arguments for Kafka over RabbitMQ ?
Hi Alexis,

This was very helpful, and I appreciate both your input and Tim's
here.  It clears up when to use Rabbit and when to use Kafka.  What is
great is that both are open source with vibrant communities behind them.

-Jonathan

On Jun 13, 2013 8:45 AM, "Alexis Richardson" <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> First, thanks to Tim (from Rabbit) and Jonathan for moving this thread
> along.  Jonathan, I hope you found my links to the data model docs,
> and Tim's replies, helpful.
>
> Has everyone got what they wanted from this thread?
>
> alexis
>
>
> On Tue, Jun 11, 2013 at 5:49 PM, Jonathan Hodges <[EMAIL PROTECTED]>
> wrote:
> > Hi Tim,
> >
> > While your comments regarding durability are accurate for the 0.7 version
> > of Kafka, it is a bit greyer with 0.8.  In 0.8 you have the ability to
> > configure Kafka to have the durability you need.  This is what I was
> > referring to with the link to Jun’s ApacheCon slides (
> > http://www.slideshare.net/junrao/kafka-replication-apachecon2013).
> >
> > If you look at slide 21, titled ‘Data Flow in Replication’, you see the
> > three possible durability configurations, which trade off latency for
> > greater persistence guarantees.
> >
> > The third row is the ‘no data loss’ configuration option where the
> > producer only receives an ack from the broker once the message(s) are
> > committed by the leader and peers (mirrors as you call them) and flushed
> > to disk.  This seems to be very similar to the scenario you describe in
> > Rabbit, no?
> >
> > Jun or Neha, can you please confirm that my understanding of 0.8
> > durability is correct and that there is no data loss in the scenario I
> > describe?  I know there is a separate configuration setting,
> > log.flush.interval.messages, but I thought in sync mode the producer
> > doesn’t receive an ack until message(s) are committed and flushed to
> > disk.  Please correct me if my understanding is incorrect.
> >
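As a rough illustration of the three durability levels on that slide, here is a toy Python model (not Kafka code). The `AckMode` values mirror what I believe are the 0.8 producer's `request.required.acks` settings (0, 1, -1), which is an assumption on my part; `Replica` and `publish` are made-up names. The model only counts in-memory copies at the moment the producer is acked. It says nothing about when data is fsynced, which is the open question above.

```python
from dataclasses import dataclass, field
from enum import Enum


class AckMode(Enum):
    NONE = 0           # producer does not wait for any acknowledgement
    LEADER = 1         # ack once the leader has appended the message
    ALL_REPLICAS = -1  # ack once the leader and every follower have it


@dataclass
class Replica:
    log: list = field(default_factory=list)


def publish(message, leader, followers, mode):
    """Store the message and return how many replicas held it at ack time."""
    if mode is AckMode.NONE:
        copies_at_ack = 0  # producer moves on before anything is stored
    leader.log.append(message)
    if mode is AckMode.LEADER:
        copies_at_ack = 1  # only the leader is guaranteed to have it
    for follower in followers:
        follower.log.append(message)
    if mode is AckMode.ALL_REPLICAS:
        copies_at_ack = 1 + len(followers)
    return copies_at_ack


leader, followers = Replica(), [Replica(), Replica()]
for mode in AckMode:
    print(mode.name, publish("m", leader, followers, mode))
```

The latency cost grows with each level because the producer waits for more work to finish before it hears back.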
> > Thanks!
> >
> >
> > On Tue, Jun 11, 2013 at 8:20 AM, Tim Watson <[EMAIL PROTECTED]> wrote:
> >
> >> Hi Jonathan,
> >>
> >> So, thanks for replying - that's all useful info.
> >>
> >> On 10 Jun 2013, at 14:19, Jonathan Hodges wrote:
> >> > Kafka has a configurable rolling window of time it keeps the messages
> >> > per topic.  The default is 7 days and after this time the messages are
> >> > removed from disk by the broker.
> >> > Correct, the consumers maintain their own state via what are known as
> >> > offsets.  Also true that when producers/consumers contact the broker
> >> > there is a random seek to the start of the offset, but the majority of
> >> > access patterns are linear.
> >> >
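To make the retention window and consumer-held offsets concrete, here is a minimal Python sketch. `TopicLog` and its methods are hypothetical names, not Kafka's API, and it expires individual messages for simplicity, which is not necessarily how the broker does it. The point is only that the log forgets data by age while each consumer remembers its own position.

```python
class TopicLog:
    """Toy append-only log with time-based retention (illustrative only)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.base_offset = 0   # offset of the oldest message still stored
        self.entries = []      # list of (timestamp, message)

    def append(self, message, now):
        self.entries.append((now, message))
        return self.base_offset + len(self.entries) - 1

    def expire(self, now):
        # Drop messages older than the retention window, oldest first.
        while self.entries and now - self.entries[0][0] > self.retention_seconds:
            self.entries.pop(0)
            self.base_offset += 1

    def read(self, offset, max_messages=10):
        # One jump to the requested position, then a linear scan.
        start = max(offset, self.base_offset) - self.base_offset
        batch = [msg for _, msg in self.entries[start:start + max_messages]]
        return batch, self.base_offset + start + len(batch)


SEVEN_DAYS = 7 * 24 * 3600
log = TopicLog(retention_seconds=SEVEN_DAYS)
log.append("a", now=0)
log.append("b", now=10)

my_offset = 0  # state kept by the consumer, not by the log
batch, my_offset = log.read(my_offset)
log.expire(now=SEVEN_DAYS + 5)  # "a" is now past the window, "b" is not
```

Note that `expire` never consults any consumer: a message disappears when it ages out, whether or not anyone has read it.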
> >>
> >> So, just to be clear, the distinction that has been raised on this
> >> thread is only part of the story, viz the difference in rates between
> >> RabbitMQ and Kafka. Essentially, these two systems are performing
> >> completely different tasks, since in RabbitMQ, the concept of a
> >> long-term persistent topic whose entries are removed solely based on
> >> expiration policy is somewhat alien. RabbitMQ will delete messages from
> >> its message store as soon as a relevant consumer has seen and ACK'ed
> >> them, which *requires* tracking consumer state in the broker. I suspect
> >> this was your (earlier) point about Kafka /not/ trying to be a general
> >> purpose message broker, but having an architecture that is highly tuned
> >> to a specific set of usage patterns.
> >>
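The delete-on-ack behaviour described here can be sketched in a few lines of Python. This is a toy model under my own assumptions, not RabbitMQ's implementation; `AckQueue`, `deliver`, and the integer delivery tags are illustrative names. What it shows is that the broker side has to remember which deliveries are still outstanding.

```python
from collections import deque


class AckQueue:
    """Toy broker-side queue that frees a message once it is acked."""

    def __init__(self):
        self.ready = deque()   # published, not yet delivered
        self.unacked = {}      # delivery tag -> message awaiting an ack
        self._next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def deliver(self):
        # Hand the oldest message to a consumer and track it until acked.
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = message
        return tag, message

    def ack(self, tag):
        # The consumer confirmed receipt, so the broker can drop the message.
        del self.unacked[tag]

    def stored(self):
        return len(self.ready) + len(self.unacked)


queue = AckQueue()
queue.publish("a")
queue.publish("b")
tag, message = queue.deliver()  # "a" is delivered but still stored
queue.ack(tag)                  # now "a" is gone for good
print(queue.stored())
```

Here the `unacked` table is per-consumer state living in the broker, which a retention-based log with consumer-held offsets does not need.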
> >> >> As you can see in the last graph of 10 million messages which is less
> >> >> than a GB on disk, the Rabbit throughput is capped around 10k/sec.
> >> >> Beyond throughput, with the pending release of 0.8, Kafka will also
> >> >> have advantages around message guarantees and durability.
> >> >>
> >> >
> >> [snip]
> >> > Correct, with 0.8 Kafka will have options similar to Rabbit's fsync
> >> > configuration option.
> >>
> >> Right, but just to be clear, unless Kafka starts to fsync for every
> >> single published message, you are /not/ going to offer the same
> >> guarantee. In

 