Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka >> mail # dev >> [jira] [Updated] (KAFKA-955) After a leader change, messages sent with ack=0 are lost

Copy link to this message
[jira] [Updated] (KAFKA-955) After a leader change, messages sent with ack=0 are lost

     [ https://issues.apache.org/jira/browse/KAFKA-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-955:

    Attachment: KAFKA-955.v7.patch

Thanks for the comments, Neha, Jay.


1. Done.
2. Incorporated with Jay's comments.


1. Done.
2. Done. I ended up using trait and objects, and let requestChannel.close and requestChannel.noOperation (I changed the name from noResponse here since it matches the noOpAction better) create new responses themselves.
3. Done.
4. Done.
5. Done.

Regarding your question, the message loss is depending on the producer throughput and queue size. Since the first message will always be silently dropped, and once the producer noticed the socket closure, it will stop producing and refresh new metadata, and if during this time the producer queue is full then it will drop more messages. So the answer would be the range between [1, produce-throughput / time-taken-to-refresh-metadata - queue-size].
> After a leader change, messages sent with ack=0 are lost
> --------------------------------------------------------
>                 Key: KAFKA-955
>                 URL: https://issues.apache.org/jira/browse/KAFKA-955
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Jason Rosenberg
>            Assignee: Guozhang Wang
>         Attachments: KAFKA-955.v1.patch, KAFKA-955.v1.patch, KAFKA-955.v2.patch, KAFKA-955.v3.patch, KAFKA-955.v4.patch, KAFKA-955.v5.patch, KAFKA-955.v6.patch, KAFKA-955.v7.patch
> If the leader changes for a partition, and a producer is sending messages with ack=0, then messages will be lost, since the producer has no active way of knowing that the leader has changed, until it's next metadata refresh update.
> The broker receiving the message, which is no longer the leader, logs a message like this:
> Produce request with correlation id 7136261 from client  on partition [mytopic,0] failed due to Leader not local for partition [mytopic,0] on broker 508818741
> This is exacerbated by the controlled shutdown mechanism, which forces an immediate leader change.
> A possible solution to this would be for a broker which receives a message, for a topic that it is no longer the leader for (and if the ack level is 0), then the broker could just silently forward the message over to the current leader.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira