-
Persistence guarantees with Kafka 0.8
Rohit Prasad 2012-09-18, 19:40
Hi, I have gone through the replication documentation of 0.8, but have not gone through the code. It seems that Kafka 0.8 is willing to accept some message loss (when the master fails while replicas are not in sync, and when the log buffer is not yet persisted) in exchange for good performance. This implies that producers will think that a message has been committed, but there is no strong guarantee that it actually has. Does this mean that Kafka should not be used in use cases where message loss cannot be tolerated? Please correct me if my conclusion after reading the docs (and pseudocode) is wrong.
Thanks, Rohit
-
Re: Persistence guarantees with Kafka 0.8
Joel Koshy 2012-09-18, 20:33
Rohit,
The producer can specify the number of acks required. If it is set to the replication factor, then the guarantee is that an ack will be sent only after the message has been committed (i.e., when all followers have received the message). If the required acks are fewer than the replication factor, then the message can be acknowledged and the leader can fail before the remaining followers have received it, in which case the message will be "lost".
Thanks,
Joel
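
To make the ack rule above concrete, here is a minimal sketch in Python. It is a toy model, not Kafka code: the Partition class and its methods are hypothetical names invented for illustration, and it assumes the worst case in which the leader fails immediately after the ack, before any further follower has fetched the message.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy model of one replicated partition (hypothetical, not a Kafka class)."""
    replication_factor: int
    # replicas[0] plays the leader; each replica is the set of message ids it holds.
    replicas: list = field(init=False)

    def __post_init__(self):
        self.replicas = [set() for _ in range(self.replication_factor)]

    def produce(self, msg_id, required_acks):
        """Store the message on `required_acks` replicas (leader first), then ack."""
        if not 1 <= required_acks <= self.replication_factor:
            raise ValueError("required_acks must be between 1 and the replication factor")
        for replica in self.replicas[:required_acks]:
            replica.add(msg_id)
        return True  # the producer sees an ack at this point

    def fail_leader(self):
        """Worst case: the leader dies right after the ack; the next replica takes over."""
        self.replicas.pop(0)

    def has(self, msg_id):
        """Does the current leader still hold the message?"""
        return msg_id in self.replicas[0]


def survives_leader_failure(replication_factor, required_acks):
    partition = Partition(replication_factor)
    partition.produce("m1", required_acks)
    partition.fail_leader()
    return partition.has("m1")
```

With a replication factor of 3 in this model, survives_leader_failure(3, 3) keeps the message, while survives_leader_failure(3, 1) loses it even though the producer received an ack.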
-
Re: Persistence guarantees with Kafka 0.8
Rohit Prasad 2012-09-18, 22:58
Thanks Joel for the reply.
Do you know if the message is committed (synced to disk) on the replicas before they respond to the leader? If it is not synced, then there is a window of vulnerability. I remember reading somewhere that, for performance, the replica commits the message in memory but does not sync it to disk immediately.
Thanks, Rohit
-
Re: Persistence guarantees with Kafka 0.8
Joel Koshy 2012-09-18, 23:51
There are time- and size-based settings for flush, so it is possible for messages to be acknowledged before they are flushed to disk. However, even if a leader fails while a flush is pending, as long as the high watermark has already passed that offset, a new leader will be elected from the ISR (the set of in-sync replicas) and that offset will remain available. Message loss is possible only if there are correlated failures across all brokers in the ISR, i.e., if all brokers fail before the messages are flushed to disk.
Thanks,
Joel
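
The failure condition described above can be sketched as a small Python model. This is an illustration under stated assumptions, not Kafka's implementation: Broker, crash_and_restart and still_available are hypothetical names, a crash is assumed to wipe everything that has not been flushed, and leader election is reduced to "any ISR broker that still holds the message can serve it".

```python
class Broker:
    """Toy broker (hypothetical): messages stay in memory until flushed to disk."""

    def __init__(self, name):
        self.name = name
        self.memory = []  # received and acknowledged, but not yet flushed
        self.disk = []    # flushed, survives a crash

    def append(self, msg):
        self.memory.append(msg)

    def flush(self):
        """Stand-in for a time- or size-triggered flush."""
        self.disk.extend(self.memory)
        self.memory.clear()

    def crash_and_restart(self):
        """Assumption: a crash discards all unflushed messages."""
        self.memory.clear()


def still_available(msg, isr):
    """The message survives if at least one ISR broker still holds a copy."""
    return any(msg in broker.memory or msg in broker.disk for broker in isr)


# A committed message held, unflushed, by both brokers in the ISR.
isr = [Broker("b0"), Broker("b1")]
for broker in isr:
    broker.append("m1")

isr[0].crash_and_restart()                 # only the leader fails
after_leader_failure = still_available("m1", isr)

isr[1].crash_and_restart()                 # now every ISR broker has failed before a flush
after_correlated_failure = still_available("m1", isr)
```

In this model the message remains available after the leader alone fails, and disappears only once every broker in the ISR has crashed before flushing.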