Re: too many duplicate messages
How would you detect a duplicate message? I assume each of your messages
carries an ID/GUID?
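
If the messages do carry a unique ID, a consumer-side check could look roughly like the sketch below. This is only an illustration in plain Python, not Kafka client code: the names RecentIdFilter and count_duplicates are made up for this example, and the bounded window of remembered IDs is an assumption about how much history you would want to keep.

```python
from collections import OrderedDict


class RecentIdFilter:
    """Remembers the most recently seen message IDs and flags repeats.

    Hypothetical helper for illustration; not part of any Kafka API.
    """

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self._seen = OrderedDict()  # insertion-ordered set of IDs

    def is_duplicate(self, message_id):
        if message_id in self._seen:
            return True
        self._seen[message_id] = None
        if len(self._seen) > self.capacity:
            # Evict the oldest ID so memory use stays bounded.
            self._seen.popitem(last=False)
        return False


def count_duplicates(message_ids, capacity=10000):
    """Count how many IDs in the stream repeat within the window."""
    id_filter = RecentIdFilter(capacity)
    return sum(1 for message_id in message_ids if id_filter.is_duplicate(message_id))
```

A repeat is only caught while its ID is still inside the window, so the capacity has to cover however many messages can be redelivered at once.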
On Fri, Nov 16, 2012 at 8:50 PM, Joel Koshy <[EMAIL PROTECTED]> wrote:

> With compression enabled (as you have) it is possible for a consumer to see
> duplicates during rebalance. This is because iteration may be in the middle
> of a compressed message set just before a rebalance, but the checkpointed
> offsets are at MessageSet boundaries. However, this would only be during
> rebalance - i.e., in steady state, when you have no change in # consumers/#
> partitions you shouldn't see duplicates.
>
> Joel
>
>
> On Fri, Nov 16, 2012 at 10:02 AM, Mark Grabois <[EMAIL PROTECTED]> wrote:
>
> > https://gist.github.com/4089354.git
> >
> > https://gist.github.com/4089369.git
> >
> >
> >
> > On Fri, Nov 16, 2012 at 12:59 PM, Mark Grabois <[EMAIL PROTECTED]> wrote:
> >
> > > Hi all,
> > >
> > > I'm encountering a problem where I'm getting far too many duplicate
> > > messages. They are sent to my Kafka setup (statically, using
> > > broker.list) and picked up by ZK-based consumers.
> > >
> > > I've provided my test classes here, along with the Kafka/ZK versions
> > > I'm using to run them and my servers:
> > >
> > > *client side*:
> > > producer: git://gist.github.com/4089354.git
> > > consumer: git://gist.github.com/4089369.git
> > > *jars*:
> > > kafka-0.7.2
> > >
> > > *server side*:
> > > 5 kafka servers, zk servers on 3 of those
> > > 1 partition per test topic per server
> > > *server versions*:
> > > kafka-0.7.1
> > > zookeeper-3.4.3
> > >
> > > Any advice would be greatly appreciated.
> > >
> > > Thank you,
> > > Mark
> > >
> > >
> > >
> >
>
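
The mechanism Joel describes in the quoted reply can be sketched with a small simulation. This is a toy model in plain Python, not Kafka code: the functions consume and simulate_rebalance are hypothetical, each inner list stands in for one compressed message set, and the checkpoint is assumed to advance only once a whole set has been iterated.

```python
def consume(message_sets, start_set, stop_after=None):
    """Iterate messages starting at a message-set boundary.

    Returns (delivered, checkpoint), where checkpoint is the index of the
    first set that was not fully consumed. stop_after models a rebalance
    that interrupts the consumer after that many delivered messages.
    """
    delivered = []
    checkpoint = start_set
    for set_index in range(start_set, len(message_sets)):
        for message in message_sets[set_index]:
            if stop_after is not None and len(delivered) >= stop_after:
                return delivered, checkpoint
            delivered.append(message)
        # The checkpoint only moves once the whole set has been iterated.
        checkpoint = set_index + 1
    return delivered, checkpoint


def simulate_rebalance(message_sets, stop_after):
    """Deliver messages, rebalance mid-stream, then resume from the checkpoint."""
    before, checkpoint = consume(message_sets, 0, stop_after=stop_after)
    after, _ = consume(message_sets, checkpoint)
    return before + after
```

With two sets of three messages each, interrupting after the fourth message makes the new owner restart at the second set's boundary, so the fourth message is delivered twice. Interrupting exactly on a set boundary produces no duplicates, which matches the point that steady state should be clean.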