Yeah, I agree, this is a problem.
The issue is that a produce request which is either in the network buffer
or in the request processing queue on the broker may still be processed
after a disconnect. So there is a race condition between that processing
and the reconnect/retry logic. You could work around this in a hacky way
using the reconnect backoff time, but the fundamental race condition
would still exist. We could easily make this more transparent by having
some mode where a disconnection throws an error back to the client, but in
fact there is no way for the client to solve this either.
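
To make the race concrete, here is a toy simulation (not Kafka code; the
queues and the run_scenario helper are made up for illustration). It
assumes the broker happens to serve the new connection before it gets to
the request left over from the old one, which is only one of the possible
interleavings.

```python
from collections import deque


def run_scenario(stale_request_survives):
    """Return the toy broker log after a disconnect followed by a retry."""
    log = []
    old_connection = deque()  # requests buffered from the dropped connection
    new_connection = deque()  # requests sent after reconnecting

    # The producer sends m1; it is queued on the broker but not yet processed.
    old_connection.append("m1")

    # The connection drops before any ack arrives. The producer cannot tell
    # whether m1 was written, so it reconnects, retries m1, then sends m2.
    new_connection.extend(["m1", "m2"])

    # Assumption: the broker serves the new connection first...
    while new_connection:
        log.append(new_connection.popleft())

    # ...and may still process the request left over from the old connection.
    if stale_request_survives:
        while old_connection:
            log.append(old_connection.popleft())
    return log


print(run_scenario(stale_request_survives=False))  # ['m1', 'm2']
print(run_scenario(stale_request_survives=True))   # ['m1', 'm2', 'm1']
```

In the second run m1 lands in the log twice and also after m2, and nothing
the producer observed lets it tell the two outcomes apart.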
Neither Storm nor Samza nor any other framework would actually fix this
issue for you, since they are in turn dependent on Kafka's ordering (though
they might solve a lot of other problems).
As Jun mentions, we have been thinking of having a per-producer sequence
number to enforce ordering. This would allow us to make produce calls
idempotent, enforce strong ordering in the case of retries, as well as fix
a number of other corner cases. I think it would handle this issue as well.
But it's not a quick patch.
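
Very roughly, the sequence number idea might look like the sketch below.
This is only my guess at the shape of it, not the design: the
ToyPartitionLog class, the producer_id argument, and the return strings
are all hypothetical, and it assumes each producer numbers its messages
0, 1, 2, ... with the broker remembering the last number it appended.

```python
class ToyPartitionLog:
    """Toy log that accepts each producer's messages strictly in sequence."""

    def __init__(self):
        self.messages = []
        self.last_seq = {}  # producer_id -> highest sequence number appended

    def append(self, producer_id, seq, message):
        expected = self.last_seq.get(producer_id, -1) + 1
        if seq < expected:
            # Already appended: a retry or a stale request. Safe to ack
            # again without writing, which makes the call idempotent.
            return "duplicate"
        if seq > expected:
            # A gap: an earlier message is missing, so refuse to reorder.
            return "out_of_order"
        self.messages.append(message)
        self.last_seq[producer_id] = seq
        return "appended"


log = ToyPartitionLog()
print(log.append("p1", 0, "m1"))  # appended
print(log.append("p1", 0, "m1"))  # duplicate (retry after a disconnect)
print(log.append("p1", 2, "m3"))  # out_of_order (m2 not seen yet)
print(log.append("p1", 1, "m2"))  # appended
print(log.messages)               # ['m1', 'm2']
```

With a check like this, a stale m1 that shows up after the retry is simply
dropped as a duplicate instead of being written a second time.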
I will try to get a design proposal up by next week so we have something
concrete to discuss.
On Thu, Aug 22, 2013 at 9:32 PM, Ross Black <[EMAIL PROTECTED]> wrote: