Re: Replication questions
Ah, gotcha, so my usage of the term "in-memory replication" can be
misleading: Kafka still doesn't retain the data in-app (i.e., in Kafka's
allocated memory), but the data is in memory nonetheless because of the
OS's file system cache.

Basically, at the individual node level this is no different from what
we already have (without KAFKA-50), but the fact that KAFKA-50 will give
us replication means that the data will reside in the OS's file system
cache of many nodes, giving us much more reliable durability guarantees.
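
To make that distinction concrete, here is a minimal Java sketch (an
illustration of the OS-level behavior being discussed, not Kafka code): a
plain write() only hands the bytes to the OS page cache, while the separate
fsync call is what makes them durable on that one machine.

    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class FlushDemo {
        public static void main(String[] args) throws Exception {
            FileChannel log = FileChannel.open(Paths.get("segment.log"),
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);

            // write() hands the bytes to the OS page cache: they survive a
            // kill -9 of this process, but not a power loss on this machine.
            log.write(ByteBuffer.wrap("message-1\n".getBytes()));

            // force(true) is the fsync: the slow call that makes the bytes
            // durable on this machine's disk even across a power loss.
            log.force(true);

            log.close();
        }
    }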

Thanks for the nitty-gritty details, Jay :)

--
Felix

On Tue, May 1, 2012 at 1:51 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> Yes, that is correct. Technically we always immediately write to the
> filesystem; it is just a question of when you fsync the file (that is
> the slow thing). So although it is in memory, it is not in application
> memory, so it always survives kill -9 but not unplugging the machine.
> Currently, when a broker fails, messages that were flushed to disk come
> back if the broker comes back with an intact filesystem (if the
> broker's fs is destroyed then they are lost). With replication we
> retain this same flexibility on the flush policy, so you can flush
> every message to disk immediately if you like. However, having the
> message on multiple machines is in some ways better durability than
> the fsync gives, as the message will survive destruction of the
> filesystem, so we think you can legitimately allow consumers to
> consume messages independent of the flush policy.
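
A simplified sketch of that idea (an illustration only, not Kafka's actual
implementation): a message becomes consumable once enough replicas have
written it, whether or not any broker has fsynced it yet.

    // Simplified illustration only -- not Kafka's actual code. A message
    // becomes visible to consumers once enough replicas have written it
    // (into their page caches), whether or not any of them has fsynced.
    class ReplicatedLog {
        private final int minReplicas;    // e.g. 2: the leader plus one follower
        private long highWatermark = -1;  // highest offset safe to serve

        ReplicatedLog(int minReplicas) {
            this.minReplicas = minReplicas;
        }

        // Called as replicas acknowledge having written (not necessarily
        // flushed) a given offset.
        synchronized void onReplicaAck(long offset, int replicasWithOffset) {
            if (replicasWithOffset >= minReplicas && offset > highWatermark) {
                highWatermark = offset;   // consumers may now read up to here
            }
        }

        synchronized boolean consumable(long offset) {
            return offset <= highWatermark;
        }
    }
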
>
> Also, when a broker fails it will lose unflushed messages; however,
> when it comes back to life it will restore these messages from the
> other replicas before it serves data to consumers. So the log will be
> byte-for-byte identical across all servers, including both the
> contents and the ordering of messages.
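
A rough sketch of that recovery sequence, again only illustrating the
behavior described above rather than the real code:

    import java.util.ArrayList;
    import java.util.List;

    // A recovering broker has lost its unflushed tail, copies the missing
    // messages back from a healthy replica, and only then starts serving
    // consumers, so every copy of the log ends up byte-for-byte identical.
    class RecoveringBroker {
        private final List<byte[]> log;
        private boolean servingConsumers = false;

        RecoveringBroker(List<byte[]> messagesThatSurvivedRestart) {
            this.log = new ArrayList<>(messagesThatSurvivedRestart);
        }

        void recoverFrom(List<byte[]> replicaLog) {
            for (int offset = log.size(); offset < replicaLog.size(); offset++) {
                log.add(replicaLog.get(offset)); // re-fetch what was lost, in order
            }
            servingConsumers = true;             // only now answer consumer fetches
        }
    }
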
>
> -Jay
>
> On Tue, May 1, 2012 at 9:24 AM, Felix GV <[EMAIL PROTECTED]> wrote:
> > Hmm... interesting!
> >
> > So, if I'm understanding correctly, what you're saying regarding
> > point 2 is that the messages are going to be kept in memory on
> > several nodes, and start being served to consumers as soon as this is
> > completed, rather than after the data is flushed to disk? This way,
> > we still benefit from the throughput gain of flushing data to disk in
> > batches, but we consider that the added durability of having
> > in-memory replication is good enough to start serving that data to
> > consumers sooner.
> >
> > Furthermore, this means that in the unlikely event that several nodes
> > would fail simultaneously (a correlated failure), the data that is
> > replicated to the failed nodes but not yet flushed on any of them
> > would be lost. However, when a single node crashes and is then
> > restarted, only the failed node will have lost its unflushed data,
> > while the other nodes that had replicated that data will have had the
> > opportunity to flush it to disk later on.
> >
> > Sorry if I'm repeating like a parrot. I just want to make sure I
> > understand correctly :)
> >
> > Please correct me if I'm not interpreting this correctly!
> >
> > --
> > Felix
> >
> >
> >
> > On Mon, Apr 30, 2012 at 5:59 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> >
> >> Yes, it is also worth noting that there are a couple of different
> >> ways to think about latency:
> >> 1. latency of the request from the producer's point of view
> >> 2. end-to-end latency to the consumer
> >>
> >> As Jun mentions, (1) may go up a little because the producer was
> >> sending data without checking for any answer from the server.
> >> Although this gives a nice buffering effect, it leads to a number
> >> of corner cases that are hard to deal with correctly. It should be
> >> the case that setting the producer to async has the same effect
> >> from the producer's point of view, without the corner cases of
> >> having no RPC response to convey errors and other broker
> >> misbehavior.
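
A generic sketch of that trade-off (the class and method names here are
made up for illustration; this is not the Kafka producer API):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CompletableFuture;

    // A sync send blocks on the broker's RPC response, so errors are
    // reported but the caller pays the round trip; an async send only
    // buffers locally, so the observed latency is lower but there is no
    // response to carry errors back.
    class SketchProducer {
        private final List<String> buffer = new ArrayList<>();

        // Stand-in for the network round trip; the response is where the
        // broker would report errors or other misbehavior.
        private CompletableFuture<Void> rpcSend(String message) {
            return CompletableFuture.runAsync(() -> { /* send + await ack */ });
        }

        void sendSync(String message) throws Exception {
            rpcSend(message).get();   // blocks until acknowledged (or failed)
        }

        void sendAsync(String message) {
            buffer.add(message);      // returns immediately; errors surface later, if at all
        }
    }
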
> >>
> >> (2) May actually get significantly better, especially for lower volume