Unpacking "strong durability and fault-tolerance guarantees"
David James 2013-07-07, 15:36
Re: Unpacking "strong durability and fault-tolerance guarantees"
Disclaimer: I only have experience right now with 0.7.2

On Sun, Jul 7, 2013 at 11:35 AM, David James <[EMAIL PROTECTED]> wrote:
> Sorry for the long email, but I've tried to keep it organized, at least.
>
> "Kafka has a modern cluster-centric design that offers strong
> durability and fault-tolerance guarantees." and "Messages are
> persisted on disk and replicated within the cluster to prevent data
> loss." according to http://kafka.apache.org/.
>
> I'm trying to understand what this means in some detail. So, two questions.
>
> 1. Fault-Tolerance
>
> If a Broker in a Kafka cluster fails (say, the EC2 instance dies), what
> happens? Afterward, let's say I add a new Broker to the cluster (that's
> my responsibility, not Kafka's). What happens when it joins?
>
> To be more specific: if the cluster consists of a ZooKeeper and B
> Brokers (3, for example), is the Kafka system guaranteed to tolerate up
> to B-1 (2, for example) Broker failures?
>
> 2. Durability at an application level
>
> What are the guarantees about durability, at an application level, in
> practice? By "application level" I mean guarantees that a produced
> message gets consumed and acted upon by an application that uses
> Kafka. My understanding at present is that Kafka does not make these
> kinds of guarantees because there are no acks. So, it is up to the
> application developer to handle it. Is this right?

Yes.
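
(A note for later readers: this thread is 0.7-era, where the producer
receives no acknowledgements. The 0.8+ releases added intra-cluster
replication and producer acks; with a replication factor of R, a
partition can survive up to R-1 broker failures, which is roughly the
B-1 intuition from question 1, applied per partition rather than per
cluster. Below is a minimal, illustrative sketch of a producer
requesting the strongest acknowledgement with the modern Java client;
the broker address and the "events" topic are assumptions, not from
this thread.)

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class DurableProducer {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed address
            // "all": the leader answers only after the full in-sync replica
            // set has acknowledged the write.
            props.put("acks", "all");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // get() blocks until the broker acknowledges or the send
                // fails, so durability errors surface to the application.
                producer.send(new ProducerRecord<>("events", "key", "value")).get();
            }
        }
    }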

>
> Here's my understanding: Having messages persisted on disk and
> replicated is why Kafka has durability guarantees. But, from an
> application perspective, what happens when a consumer pulls a message
> but fails before acting on it? That would update the Kafka consumer
> offset, right? So, without some planning ahead in the design of the
> Kafka system, the application's consumers would have no way of knowing
> that a message was not actually processed.

Don't update the offset until the message has been fully processed. This
means you need to build downstream systems that can tolerate messages
being replayed, since a message may be processed but the consumer may
crash before the offset is updated. Or at least have a process in place
to handle clean-up in the event of a crash.
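
For illustration, here is a minimal sketch of that pattern using the
modern Java consumer API, which post-dates this 0.7-era thread (with
0.7's SimpleConsumer you would store offsets yourself, but the idea is
the same): disable auto-commit, process the batch, then commit. The
broker address, group id, and "events" topic are assumptions.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class AtLeastOnceConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed address
            props.put("group.id", "example-group");           // assumed group
            // Disable auto-commit so the offset advances only when we say so.
            props.put("enable.auto.commit", "false");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("events"));
                while (true) {
                    ConsumerRecords<String, String> records =
                            consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // Must be idempotent: a crash after processing but
                        // before the commit below replays this record.
                        process(record);
                    }
                    // Commit only after the whole batch is fully processed,
                    // so a crash means replay, not loss.
                    consumer.commitSync();
                }
            }
        }

        private static void process(ConsumerRecord<String, String> record) {
            System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
        }
    }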

>
> Conclusion / Last Question
>
> I'm interested in making the chance of message loss minimal, at a
> system level. Any pointers on what to read or think about would be
> appreciated!
>
> Thanks!
> -David

Jay Kreps 2013-07-07, 21:36
Florin Trofin 2013-07-13, 01:39
Eric Sites 2013-07-13, 02:58