Kafka, mail # user - Re: is it possible to commit offsets on a per stream basis?


Re: is it possible to commit offsets on a per stream basis?
Neha Narkhede 2013-09-07, 19:09
>> Can I create multiple connectors, and have each use the same Regex
>> for the TopicFilter?  Will each connector share the set of available
>> topics?  Is this safe to do?

>> Or is it necessary to create mutually non-intersecting regexes for each
>> connector?

As long as each of those consumer connectors shares the same group id, Kafka
consumer rebalancing should automatically redistribute the
topic/partitions among the consumer connectors/streams evenly.
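For context, the rebalancer in the 0.8-era high-level consumer distributes each topic's partitions across the group's streams in sorted order, as evenly as possible (the consumers listed first absorb any remainder). Below is a standalone sketch of that distribution logic; the class and method names are illustrative, not part of the Kafka API:

```java
import java.util.*;

// Standalone sketch of range-style partition assignment: for one topic,
// partitions are split as evenly as possible across the sorted consumer
// streams of a group, with the first (numPartitions % numConsumers)
// consumers receiving one extra partition.
public class RangeAssignSketch {
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted);
        int n = sorted.size();
        int perConsumer = numPartitions / n;   // base share for every consumer
        int extra = numPartitions % n;         // first `extra` consumers get one more
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int next = 0;
        for (int i = 0; i < n; i++) {
            int count = perConsumer + (i < extra ? 1 : 0);
            List<Integer> parts = new ArrayList<>();
            for (int j = 0; j < count; j++) parts.add(next++);
            result.put(sorted.get(i), parts);
        }
        return result;
    }

    public static void main(String[] args) {
        // 5 partitions spread over 2 connectors in the same group
        System.out.println(assign(Arrays.asList("connector-a", "connector-b"), 5));
        // → {connector-a=[0, 1, 2], connector-b=[3, 4]}
    }
}
```

The point for Jason's question: two connectors with the same group id and the same TopicFilter regex do not double-consume; each matching partition lands on exactly one of them.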

Thanks,
Neha
On Mon, Sep 2, 2013 at 1:35 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> Will this work if we are using a TopicFilter that can map to multiple
> topics?  Can I create multiple connectors, and have each use the same Regex
> for the TopicFilter?  Will each connector share the set of available
> topics?  Is this safe to do?
>
> Or is it necessary to create mutually non-intersecting regexes for each
> connector?
>
> It seems I have a similar issue.  I have been using auto commit mode, but
> it doesn't guarantee that all committed messages have been successfully
> processed (it seems a change to the connector itself might expose a way to
> use auto offset commit and have it never commit a message until it is
> processed).  But that would be a change to the
> ZookeeperConsumerConnector.  Essentially, it would be great if, after
> processing each message, we could mark the message as 'processed', and thus
> use that status as the max offset to commit when the auto offset commit
> background thread wakes up each time.
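Jason's idea can be sketched without touching the connector at all: track which offsets the application has finished processing, and let the commit thread read the highest contiguously processed offset instead of the raw consumed position. The tracker below is a hypothetical illustration, not part of the ZookeeperConsumerConnector:

```java
import java.util.TreeSet;

// Hypothetical sketch of the "mark as processed" idea for one partition:
// workers record offsets they have fully processed, and safeToCommit()
// returns the highest offset such that every offset at or below it is
// processed. A commit thread could use this watermark when it wakes up.
public class ProcessedOffsetTracker {
    private long committed;                  // highest contiguous processed offset
    private final TreeSet<Long> pending = new TreeSet<>(); // processed, not yet contiguous

    public ProcessedOffsetTracker(long initialOffset) {
        this.committed = initialOffset;
    }

    // Called by a worker thread once the message at `offset` is fully processed.
    public synchronized void markProcessed(long offset) {
        pending.add(offset);
        // Fold any now-contiguous offsets into the committable watermark.
        while (!pending.isEmpty() && pending.first() == committed + 1) {
            committed = pending.pollFirst();
        }
    }

    // Called by the (auto-)commit thread: the max offset safe to commit.
    public synchronized long safeToCommit() {
        return committed;
    }
}
```

For example, with messages 1..5 consumed but only 1, 2, and 4 processed so far, `safeToCommit()` returns 2; once 3 completes, it jumps to 4. A crash then replays at most the unprocessed tail, never loses a processed-but-uncommitted gap.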
>
> Jason
>
>
> On Thu, Aug 29, 2013 at 11:58 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
>
> > Thanks, Neha. That is a great answer.
> >
> > Regards,
> >
> > Libo
> >
> >
> > -----Original Message-----
> > From: Neha Narkhede [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, August 29, 2013 1:55 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: is it possible to commit offsets on a per stream basis?
> >
> > 1 We can create multiple connectors. From each connector create only one
> > stream.
> > 2 Use a single thread for a stream. In this case, the connector in each
> > thread can commit freely without any dependence on the other threads.  Is
> > this the right way to go? Will it introduce any deadlock when multiple
> > connectors commit at the same time?
> >
> > This is a better approach as there is no complex locking involved.
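The shape of that workaround is easy to see in isolation: when each thread owns its own connector, a commit touches only that thread's state, so there is nothing to coordinate and nothing to deadlock on. A standalone sketch (the `Connector` class here is a stand-in, not the Kafka API):

```java
import java.util.concurrent.atomic.AtomicLong;

// Standalone sketch of "one connector, one stream, one thread": each thread
// commits its own connector's offset independently, with no barrier and no
// shared lock between threads.
public class PerConnectorCommitSketch {
    // Stand-in for one consumer connector owned by a single thread.
    static class Connector {
        final AtomicLong committedOffset = new AtomicLong();
        void commitOffsets(long offset) { committedOffset.set(offset); }
    }

    public static void main(String[] args) throws Exception {
        int numConnectors = 3;
        Connector[] connectors = new Connector[numConnectors];
        Thread[] threads = new Thread[numConnectors];
        for (int i = 0; i < numConnectors; i++) {
            Connector c = connectors[i] = new Connector();
            long processedUpTo = 100 + i;  // each stream progresses at its own pace
            // No coordination: every thread commits whenever it is ready.
            threads[i] = new Thread(() -> c.commitOffsets(processedUpTo));
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        for (Connector c : connectors) System.out.println(c.committedOffset.get());
    }
}
```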
> >
> > Thanks,
> > Neha
> >
> >
> > On Thu, Aug 29, 2013 at 10:28 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> >
> > > Hi team,
> > >
> > > This is our current use case:
> > > Assume there is a topic with multiple partitions.
> > > 1 Create a connector first and create multiple streams from the
> > > connector for a topic.
> > > 2 Create multiple threads, one for each stream. You can assume the
> > > thread's job is to save the message into the database.
> > > 3 When it is time to commit offsets, all threads have to synchronize
> > > on a barrier before committing the offsets. This is to ensure no
> > > message loss in case of process crash.
> > >
> > > As all threads need to synchronize before committing, it is not
> > efficient.
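The barrier in step 3 of the original design maps directly onto `java.util.concurrent.CyclicBarrier`, whose barrier action runs exactly once per generation after every thread arrives. A standalone illustration, with the barrier action standing in for the connector's single `commitOffsets()` call:

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Standalone sketch of the barrier-before-commit design: worker threads
// finish persisting their batch, then all wait on a barrier; the barrier
// action commits once for the whole group, so a crash can only replay
// messages, never lose ones consumed past the last commit.
public class BarrierCommitSketch {
    static final AtomicInteger commits = new AtomicInteger();

    public static void main(String[] args) throws Exception {
        int numStreams = 4;
        // The barrier action runs once, after all workers arrive.
        CyclicBarrier barrier = new CyclicBarrier(numStreams,
                () -> commits.incrementAndGet()); // stand-in for commitOffsets()

        Thread[] workers = new Thread[numStreams];
        for (int i = 0; i < numStreams; i++) {
            workers[i] = new Thread(() -> {
                try {
                    // ... drain a batch of messages and save to the database ...
                    barrier.await(); // block until every stream is ready to commit
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();
        System.out.println("commits = " + commits.get()); // prints "commits = 1"
    }
}
```

The cost Libo describes is visible here: every stream stalls at `await()` until the slowest one finishes its batch.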
> > > This is a workaround:
> > >
> > > 1 We can create multiple connectors. From each connector create only
> > > one stream.
> > > 2 Use a single thread for a stream. In this case, the connector in
> > > each thread can commit freely without any dependence on the other
> > > threads.  Is this the right way to go? Will it introduce any deadlock
> > > when multiple connectors commit at the same time?
> > >
> > > It would be great to allow committing on a per stream basis.
> > >
> > > Regards,
> > >
> > > Libo
> > >
> > >
> >
>