Kafka, mail # user - fidelity of offsets when mirroring


Re: fidelity of offsets when mirroring
Martin Kleppmann 2014-03-06, 12:29
If you really don't mind some messages being lost during failover, your simplest option would be to just restart consumers at the latest offset in the new AZ. Or, if you don't mind messages being duplicated, rewind to an earlier time t as explained by Jun and Neha.

Another thought: you might be able to provide stronger guarantees at the application level. For example, you could include a unique identifier in every message, and use that to detect and discard duplicate messages after failover. However, keeping track of all those message IDs might require too much state (and that state would also have to be replicated across AZs). If you're doing offline processing of the data, e.g. importing it into Hadoop, then de-duplicating by message ID might be feasible. Just an idea.
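
To make the message-ID idea a bit more concrete, here is a rough sketch in plain Python. It is not tied to any Kafka client API. The `Deduplicator` class, the `capacity` parameter and the `"id"` field are all names made up for illustration. It keeps only the most recent IDs in memory to bound the state, which means a duplicate that arrives after its ID has been evicted will slip through.

```python
from collections import OrderedDict


class Deduplicator:
    """Remembers the most recent `capacity` message IDs (hypothetical helper)."""

    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self._seen = OrderedDict()  # message ID -> None, oldest first

    def is_duplicate(self, message_id):
        if message_id in self._seen:
            # Seen recently: refresh its position and report a duplicate.
            self._seen.move_to_end(message_id)
            return True
        self._seen[message_id] = None
        if len(self._seen) > self.capacity:
            # Evict the oldest ID so the state stays bounded.
            self._seen.popitem(last=False)
        return False


def deduplicate(messages, dedup):
    """Yield only messages whose "id" field (assumed) has not been seen recently."""
    for msg in messages:
        if not dedup.is_duplicate(msg["id"]):
            yield msg


if __name__ == "__main__":
    replayed = [{"id": "a"}, {"id": "b"}, {"id": "a"}, {"id": "c"}]
    for msg in deduplicate(replayed, Deduplicator(capacity=10)):
        print(msg["id"])
```

The capacity would need to cover at least as many messages as you might replay after rewinding to time t. For the failover case, the set of seen IDs would also have to be available in the new AZ, which this sketch does not attempt.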

Martin

On 5 Mar 2014, at 17:30, Neha Narkhede <[EMAIL PROTECTED]> wrote: