Re: Duplicate records in Kafka 0.7
Do you mean duplicate records on the consumer side? Duplicates are
possible if a consumer fails and another consumer instance resumes
from an earlier offset. They are also possible if the producer retries
after an exception while producing. Do you see any such errors in your
logs? Outside these scenarios, you shouldn't be seeing duplicates.

Thanks,

Joel
On Wed, Jan 8, 2014 at 5:21 PM, Xuyen On <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I would like to check whether other people are seeing duplicate records with Kafka 0.7. I read the JIRAs and I believe that duplicates are still possible when using message compression on Kafka 0.7. I'm seeing duplicate records in the range of 6-13%. Is this normal?
>
> If you're using Kafka 0.7 with message compression enabled, can you please let me know whether you see duplicate records and, if so, what percentage?
>
> Also, please let me know what sort of deduplication strategy you're using.
>
> Thanks!
>
>
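
On the deduplication question, one common consumer-side approach is to remember a digest of each recently processed payload and skip any message whose digest has already been seen. The following is only a rough sketch of that idea, not something taken from Kafka itself: `RecentDigestFilter`, `deduplicate`, and the `capacity` parameter are hypothetical names, and the payloads are assumed to arrive as `bytes` from whatever consumer loop you already have. Note that hashing the content means two legitimately identical messages are also treated as duplicates, and anything older than the window is forgotten.

```python
import hashlib
from collections import OrderedDict


class RecentDigestFilter:
    """Hypothetical helper: remembers digests of the most recent payloads."""

    def __init__(self, capacity=100_000):
        # capacity bounds memory use; older digests are evicted first.
        self.capacity = capacity
        self._seen = OrderedDict()

    def is_duplicate(self, payload: bytes) -> bool:
        digest = hashlib.sha1(payload).digest()
        if digest in self._seen:
            # Refresh recency so frequently repeated payloads stay tracked.
            self._seen.move_to_end(digest)
            return True
        self._seen[digest] = None
        if len(self._seen) > self.capacity:
            # Evict the least recently seen digest.
            self._seen.popitem(last=False)
        return False


def deduplicate(payloads, dedup_filter):
    """Yield only the payloads the filter has not seen recently."""
    for payload in payloads:
        if not dedup_filter.is_duplicate(payload):
            yield payload
```

For example, `list(deduplicate(messages, RecentDigestFilter()))` would drop repeated payloads from an iterable of `bytes`. The window should be sized to cover at least as many messages as could be replayed after a consumer restart or a producer retry; if your messages already carry a unique ID, keying on that ID instead of a content hash avoids dropping identical but distinct records.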