Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka >> mail # user >> A question about possible race condition


Copy link to this message
-
Re: A question about possible race condition
Then, you may be interested in our consume client redesign:

https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design

Thanks,

Jun

On Fri, Nov 2, 2012 at 7:17 AM, Guy Peleg <[EMAIL PROTECTED]> wrote:

> Jun,
>
> On that note, waiting to consumer's acknowledgment can be with configurable
> timeout ranging from blocking to not wait at all
>
> Thanks,
>
> Guy
>
>
>
> On Fri, Nov 2, 2012 at 12:38 PM, Guy Peleg <[EMAIL PROTECTED]> wrote:
>
> > Jun,
> >
> > I'm not sure that's enough.
> >
> > A callback may not be enough since we can't be sure that there are no
> > events from that partition being processed while the new consumer starts
> > processing events from that partition.
> >
> > I think that a consumer should be handed a partition only after we're
> sure
> > there is no other consumer that is reading or *processing *events from
> > that partition.
> >
> > The only way to achieve that is, I think, by some kind of acknowledgment
> > from the consumer side that it is ready to give up the partition (e.g.
> > after gracefully stopped working internally with those events)
> >
> > I know that means that there is a need to consider the extreme cases
> here,
> > but still I think we can't do without the consumer's 'acknoledment'
> without
> > being subject to race scenarios.
> >
> > Thanks,
> >
> > Guy
> >
> >
> >
> >
> > On Thu, Nov 1, 2012 at 4:56 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >
> >> Guy,
> >>
> >> Yes, this is possible. One solution that we have been thinking about is
> >> that if a rebalance happens, each consumer can somehow get a callback
> that
> >> indicates the set of partitions being consumed may have changed. Will
> this
> >> address your concern?
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Thu, Nov 1, 2012 at 12:10 AM, Guy Peleg <[EMAIL PROTECTED]> wrote:
> >>
> >> > One more possible race might happen when the partition number is fixed
> >> but
> >> > consumer(s) are added/removed
> >> > For example: If I have a consumer reading data from two partitions
> >> > (partition one and partition two), and a new consumer is added, the
> >> result
> >> > will be that each consumer will consume from one partition
> >> > let's say that the 'old' consumer will continue with partition one
> while
> >> > the new consumer will process the data from partition two
> >> >
> >> > but, suppose that partition two held events that belong to event id
> 'x',
> >> > and that partition is now consumed by the new consumer,
> >> > Since consumers might reside on different machines and they are
> possibly
> >> > multithreaded processes, there might be a situation that other event
> ids
> >> > 'x' are already 'in the internal queues' and are being processed
> >> > by the first consumer (events that were read/entered the first
> consumer
> >> > before the new consumer appeared but are being processed or wait to
> >> > processed within the 'old' consumer) and that means that there is a
> >> > possibility that those events are being processed simultaneously by
> the
> >> two
> >> > consumers (since the new consumer will start reading events that might
> >> be
> >> > of id 'x' and that might be then processed in parallel with event ids
> >> 'x'
> >> > in the old consumer)
> >> >
> >> > If that is a possible scenario then when a new consumer is starting
> >> there
> >> > should be some kind of 'consumers sync'
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Wed, Oct 31, 2012 at 4:57 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >> >
> >> > > Guy,
> >> > >
> >> > > This is really an issue with changing # of partitions. If # of
> >> partitions
> >> > > changes for a topic, in the transition phase, messages used to be
> >> > delivered
> >> > > to the same partition could be delivered to different partitions and
> >> > their
> >> > > consumption ordering is non-deterministic (since ordered consumption
> >> is
> >> > > only guaranteed within a partition).
> >> > >
> >> > > In 0.7, # of partitions increases as new brokers are added. In 0.8,
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB