Kafka, mail # user - Persistence guarantees with Kafka 0.8


Re: Persistence guarantees with Kafka 0.8
Joel Koshy 2012-09-18, 23:51
There are time- and size-based settings for flush, so it is possible for
messages to be acknowledged before they are flushed. However, even if a
leader fails while a flush is pending, as long as the high-watermark has
surpassed that offset, a new leader will be elected from the ISR and that
offset will remain available. Message loss is possible only if there are
correlated failures across all brokers in the ISR - i.e., if all brokers
fail before the messages are flushed to disk.
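To make the two knobs discussed in this thread concrete, here is a minimal sketch of the relevant 0.8-era configuration properties, assuming the producer setting `request.required.acks` and the broker settings `log.flush.interval.messages` / `log.flush.interval.ms` (the values shown are hypothetical examples, not recommendations). It uses only `java.util.Properties`, so it runs without a Kafka installation:

```java
import java.util.Properties;

public class KafkaDurabilityConfig {

    // Producer side (assumed 0.8-era property name): request.required.acks
    //   0  -> fire-and-forget, no acknowledgement
    //   1  -> ack once the leader has the message
    //  -1  -> ack only after all in-sync replicas have the message
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("request.required.acks", "-1"); // strongest guarantee
        return props;
    }

    // Broker side: flush is driven by time/size thresholds, which is why
    // an ack can precede the actual write to disk (example values only).
    static Properties brokerProps() {
        Properties props = new Properties();
        props.put("log.flush.interval.messages", "10000"); // flush every N messages
        props.put("log.flush.interval.ms", "1000");        // or every N milliseconds
        return props;
    }

    public static void main(String[] args) {
        System.out.println("producer acks = "
                + producerProps().getProperty("request.required.acks"));
        System.out.println("broker flush every "
                + brokerProps().getProperty("log.flush.interval.ms") + " ms");
    }
}
```

Even with `request.required.acks` set to -1, the ack means "replicated to the ISR", not "fsynced on every replica" - which is exactly the distinction Joel draws above.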

Thanks,

Joel

On Tue, Sep 18, 2012 at 3:58 PM, Rohit Prasad <[EMAIL PROTECTED]> wrote:

> Thanks Joel for the reply.
>
> Do you know if the message is committed (synced to disk) on the replicas
> before they respond to the leader? If it is not synced then there is a
> window of vulnerability. I remember reading somewhere that the replica
> commits it in memory but not to disk immediately to get performance.
>
> Thanks,
> Rohit
>
> On Tue, Sep 18, 2012 at 1:33 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:
>
> > Rohit,
> >
> > The producer can specify the number of acks required - if it is set to
> > the replication factor, then the guarantee is that an ack will be sent
> > only after the message has been committed (i.e., when all followers
> > have received the message). If the required acks < replication factor,
> > then it is possible for the message to be acknowledged before a leader
> > failure and the message will be "lost".
> >
> > Thanks,
> >
> > Joel
> >
> > On Tue, Sep 18, 2012 at 12:40 PM, Rohit Prasad <[EMAIL PROTECTED]> wrote:
> >
> > > Hi,
> > > I have gone through the replication documentation of 0.8, but have not
> > > gone through the code. It seems that Kafka 0.8 is willing to accept
> > > some message loss (when the leader fails while replicas are not in
> > > sync, and when the log buffer is not yet persisted) in exchange for
> > > good performance. It implies that producers will think a message has
> > > been committed when there is no strong guarantee that it actually has.
> > > Does it mean that Kafka should not be used in use-cases where message
> > > loss cannot be tolerated? Please correct me if my conclusion after
> > > reading the docs (and pseudocode) is wrong.
> > >
> > > Thanks,
> > > Rohit
> > >
> >
>