Kafka >> mail # user >> too many duplicate messages


Re: too many duplicate messages
With compression enabled (as you have), a consumer can see duplicates during
a rebalance. Iteration may be partway through a compressed message set when
the rebalance starts, but offsets are checkpointed only at MessageSet
boundaries, so the messages already consumed from that set are delivered
again afterwards. This only happens around a rebalance: in steady state, with
no change in the number of consumers or partitions, you shouldn't see
duplicates.
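
To make the boundary effect concrete, here is a toy simulation. It is not Kafka code: the names MESSAGE_SETS and consume() are made up for illustration, and it assumes only what is described above, namely that a checkpoint can land on a message-set boundary but never inside a set.

```python
from collections import Counter

# Hypothetical model: each inner list stands for one compressed message set.
MESSAGE_SETS = [["m0", "m1", "m2"], ["m3", "m4", "m5"], ["m6", "m7", "m8"]]


def consume(message_sets, start_set, max_messages=None):
    """Read messages starting at the message set with index `start_set`.

    Returns (delivered, checkpoint). `checkpoint` is the index of the first
    set that was not fully consumed; positions inside a set cannot be
    checkpointed in this model.
    """
    delivered = []
    checkpoint = start_set
    for index in range(start_set, len(message_sets)):
        for message in message_sets[index]:
            if max_messages is not None and len(delivered) >= max_messages:
                # Interrupted (e.g. by a rebalance) before finishing this set.
                return delivered, checkpoint
            delivered.append(message)
        checkpoint = index + 1  # whole set consumed, so a boundary is reached
    return delivered, checkpoint


# Consumer A handles 4 messages, then a rebalance interrupts it mid-set.
seen_by_a, checkpoint = consume(MESSAGE_SETS, start_set=0, max_messages=4)

# Consumer B takes over the partition and resumes from the checkpoint.
seen_by_b, _ = consume(MESSAGE_SETS, start_set=checkpoint)

counts = Counter(seen_by_a + seen_by_b)
duplicates = sorted(m for m, n in counts.items() if n > 1)
print(duplicates)
```

In this run, consumer A stops one message into the second set, so the checkpoint stays at the start of that set and consumer B re-reads that one message. If A had stopped exactly on a set boundary, nothing would be replayed.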

Joel
On Fri, Nov 16, 2012 at 10:02 AM, Mark Grabois <[EMAIL PROTECTED]> wrote:

> https://gist.github.com/4089354.git
>
> https://gist.github.com/4089369.git
>
>
>
> On Fri, Nov 16, 2012 at 12:59 PM, Mark Grabois <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
> >
> > I'm running into a problem where I get far too many duplicate messages.
> > The messages are sent to my Kafka setup (statically, using broker.list)
> > and picked up by zk-based consumers.
> >
> > I've provided my test classes here, along with the Kafka/ZK versions I'm
> > using to run them and my servers:
> >
> > *client side*:
> > producer: git://gist.github.com/4089354.git
> > consumer: git://gist.github.com/4089369.git
> > *jars*:
> > kafka-0.7.2
> >
> > *server side*:
> > 5 kafka servers, zk servers on 3 of those
> > 1 partition per test topic per server
> > *server versions*:
> > kafka-0.7.1
> > zookeeper-3.4.3
> >
> > Any advice would be greatly appreciated.
> >
> > Thank you,
> > Mark
> >
> >
> >
>