Yeah I'm not sure how good our understanding was when we wrote that.
Here is my take now:
At least once delivery is not that hard but you need the ability to
deduplicate things--basically you turn the "at least once delivery channel"
into the "exactly once channel" by throwing away duplicates. This means
1. Some key assigned by the producer that allows the broker to detect a
re-published message to make publishing idempotent. This solves the problem
of producer retries. This key obviously has to be highly available--i.e. if
the leader for a partition fails the follower must correctly deduplicate
for all committed messages.
2. Some key that allows the consumer to detect a re-consumed message.
The first item is actually pretty doable as we can track some producer
sequence in the log and use it to avoid duplicate appends. We just need to
implement it. I think this can be done in a way that is fairly low overhead
and can be "on by default".
We actually already provide such a key to the consumer--the offset. Making
use of this is actually somewhat application dependent. Obviously providing
exactly-once guarantees in the case of no failures is easy and we already
handle that case. The harder part is if a consumer process dies to ensure
that it restarts in a position that exactly matches the state changes that
it has made in some destination system. If the consumer application uses
the offset in a way that makes updates idempotent that will work, or if
they commit their offset and data atomically that works. However in general
the goal of a consumer is to produce some state change in another system (a
db, hdfs, some other data system, etc) and having a general solution that
works with all of these is hard since they have very different limitations
On Wed, Aug 7, 2013 at 4:00 PM, Yang <[EMAIL PROTECTED]> wrote:
> I wonder why at-least-once guarantee is easier to maintain than
> exactly-once (in that the latter requires 2PC while the former does not ,
> according to
> if u achieve at-least-once guarantee, you are able to assert between 2
> cases "nothing" vs ">=1 delivered", which can be seen as 2 different
> answers 0 and 1. isn't this as hard as the common Byzantine general