Kafka user mailing list: Re: is it possible to commit offsets on a per stream basis?


Earlier messages in this thread:
  Neha Narkhede 2013-08-29, 17:55
  Jason Rosenberg 2013-09-02, 20:36
  Neha Narkhede 2013-09-07, 19:09
  Jason Rosenberg 2013-09-08, 03:33
Re: is it possible to commit offsets on a per stream basis?
That should be fine too.
On Sat, Sep 7, 2013 at 8:33 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> To be clear, it looks like I forgot to add to my question that I am asking
> about creating multiple connectors within the same consumer process (as I
> realize I can obviously have multiple connectors running on multiple hosts,
> etc.).  But I'm guessing that should be fine too?
>
> Jason
>
>
>
>
> On Sat, Sep 7, 2013 at 3:09 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>
> > >> Can I create multiple connectors, and have each use the same Regex
> > for the TopicFilter?  Will each connector share the set of available
> > topics?  Is this safe to do?
> >
> > >> Or is it necessary to create mutually non-intersecting regexes for
> > each connector?
> >
> > As long as each of those consumer connectors shares the same group id,
> > Kafka consumer rebalancing should automatically re-distribute the
> > topic/partitions amongst the consumer connectors/streams evenly.
> >
> > Thanks,
> > Neha
> >
> >
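
A minimal sketch of the setup being asked about, using the 0.8 high-level
consumer API; the ZooKeeper address, group id, and topic regex below are
illustrative assumptions. Both connectors share a group.id, so rebalancing,
not the filter, decides which connector owns each partition:

    import java.util.List;
    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.KafkaStream;
    import kafka.consumer.Whitelist;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class SharedFilterExample {

        static ConsumerConnector connect() {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // assumption
            props.put("group.id", "my-group");                // same group for all
            return Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        }

        public static void main(String[] args) {
            ConsumerConnector c1 = connect();
            ConsumerConnector c2 = connect();

            // The same regex on both connectors is safe: because they share
            // a group id, a rebalance splits the matching partitions between them.
            List<KafkaStream<byte[], byte[]>> s1 =
                c1.createMessageStreamsByFilter(new Whitelist("events\\..*"), 1);
            List<KafkaStream<byte[], byte[]>> s2 =
                c2.createMessageStreamsByFilter(new Whitelist("events\\..*"), 1);
            // ... hand each stream to its own consuming thread ...
        }
    }
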
> > On Mon, Sep 2, 2013 at 1:35 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
> >
> > > Will this work if we are using a TopicFilter that can map to multiple
> > > topics?  Can I create multiple connectors, and have each use the same
> > > Regex for the TopicFilter?  Will each connector share the set of
> > > available topics?  Is this safe to do?
> > >
> > > Or is it necessary to create mutually non-intersecting regexes for each
> > > connector?
> > >
> > > It seems I have a similar issue.  I have been using auto commit mode,
> > > but it doesn't guarantee that all messages committed have been
> > > successfully processed (it seems a change to the connector itself might
> > > expose a way to use auto offset commit, and have it never commit a
> > > message until it is processed).  But that would be a change to the
> > > ZookeeperConsumerConnector.... Essentially, it would be great if, after
> > > processing each message, we could mark the message as 'processed', and
> > > thus use that status as the max offset to commit when the auto offset
> > > commit background thread wakes up each time.
> > >
> > > Jason
> > >
> > >
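
As an illustration only of the change Jason is describing (the stock
ZookeeperConsumerConnector has no such hook, and every name below is
hypothetical), the bookkeeping the auto-commit thread would consult might
look like this:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical: tracks, per partition, the highest offset whose message
    // has been fully processed. The connector's auto-commit thread would
    // commit from this map instead of from the consumed (fetch) position.
    public class ProcessedWatermark {
        // In the high-level consumer each stream/partition is consumed by
        // exactly one thread, so a plain put per partition is race-free.
        private final Map<Integer, Long> processed =
            new ConcurrentHashMap<Integer, Long>();

        // Worker threads call this after a message is durably handled.
        public void markProcessed(int partition, long offset) {
            processed.put(partition, offset);
        }

        // What the background committer would read when it wakes up.
        public Map<Integer, Long> offsetsToCommit() {
            return new HashMap<Integer, Long>(processed);
        }
    }
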
> > > On Thu, Aug 29, 2013 at 11:58 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> > >
> > > > Thanks, Neha. That is a great answer.
> > > >
> > > > Regards,
> > > >
> > > > Libo
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Neha Narkhede [mailto:[EMAIL PROTECTED]]
> > > > Sent: Thursday, August 29, 2013 1:55 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Re: is it possible to commit offsets on a per stream basis?
> > > >
> > > > 1 We can create multiple connectors. From each connector create only
> > > > one stream.
> > > > 2 Use a single thread for a stream. In this case, the connector in
> > > > each thread can commit freely without any dependence on the other
> > > > threads.  Is this the right way to go? Will it introduce any deadlock
> > > > when multiple connectors commit at the same time?
> > > >
> > > > This is a better approach as there is no complex locking involved.
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > >
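
A sketch of that one-stream-per-connector pattern, assuming the 0.8
high-level consumer with auto commit disabled; the group id, topic regex,
thread count, and database sink are all illustrative:

    import java.util.Properties;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.consumer.Whitelist;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class OneStreamPerConnector {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // assumption
            props.put("group.id", "db-writers");              // shared group id
            props.put("auto.commit.enable", "false");         // commit manually

            int numThreads = 4; // illustrative
            for (int i = 0; i < numThreads; i++) {
                final ConsumerConnector connector =
                    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
                // Exactly one stream per connector.
                final KafkaStream<byte[], byte[]> stream = connector
                    .createMessageStreamsByFilter(new Whitelist("events\\..*"), 1)
                    .get(0);
                new Thread(new Runnable() {
                    public void run() {
                        ConsumerIterator<byte[], byte[]> it = stream.iterator();
                        while (it.hasNext()) {
                            saveToDatabase(it.next().message());
                            // Commits only this connector's single stream, so
                            // no coordination with the other threads is needed.
                            connector.commitOffsets();
                        }
                    }
                }).start();
            }
        }

        static void saveToDatabase(byte[] message) { /* hypothetical sink */ }
    }

Committing after every message keeps the sketch short; a real consumer would
batch commits to limit the ZooKeeper writes.
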
> > > > On Thu, Aug 29, 2013 at 10:28 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi team,
> > > > >
> > > > > This is our current use case:
> > > > > Assume there is a topic with multiple partitions.
> > > > > 1 Create a connector first and create multiple streams from the
> > > > > connector for a topic.
> > > > > 2 Create multiple threads, one for each stream. You can assume the
> > > > > thread's job is to save the message into the database.
> > > > > 3 When it is time to commit offsets, all threads have to
> > > > > synchronize on a barrier before committing the offsets. This is to
> > > > > ensure no message loss in case of process crash.
> > > > >
> > > > > As all threads need to synchronize before committing, it is not
> > > > > efficient. This is a workaround:
> > > > >
> > > > > 1 We can create multiple connectors. From each connector create only
> > > > > one stream.
> > > > > 2 Use a single thread for a stream. In this case, the connector in
> > > > > each thread can commit freely without any dependence on the other
> > > > > threads.  Is this the right way to go? Will it introduce any deadlock
> > > > > when multiple connectors commit at the same time?
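
For contrast, a sketch of the barrier scheme described in the original
message: one connector, several streams, and all threads meeting at a
CyclicBarrier whose barrier action performs the single commit. The topic
name, thread count, and sink are illustrative, and synchronizing after every
message is shown only to keep the sketch short:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.CyclicBarrier;
    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class BarrierCommitConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // assumption
            props.put("group.id", "db-writers");
            props.put("auto.commit.enable", "false");

            final ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            final int numStreams = 4; // illustrative
            Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
            topicCountMap.put("my-topic", numStreams); // hypothetical topic
            List<KafkaStream<byte[], byte[]>> streams =
                connector.createMessageStreams(topicCountMap).get("my-topic");

            // No offset is written until every thread reaches the barrier;
            // the barrier action then issues one commit covering all streams.
            // This is the inefficiency noted above: a slow or idle stream
            // stalls every other thread.
            final CyclicBarrier barrier = new CyclicBarrier(numStreams, new Runnable() {
                public void run() { connector.commitOffsets(); }
            });

            for (final KafkaStream<byte[], byte[]> stream : streams) {
                new Thread(new Runnable() {
                    public void run() {
                        ConsumerIterator<byte[], byte[]> it = stream.iterator();
                        while (it.hasNext()) {
                            saveToDatabase(it.next().message());
                            try {
                                barrier.await(); // wait for the other threads
                            } catch (Exception e) {
                                throw new RuntimeException(e);
                            }
                        }
                    }
                }).start();
            }
        }

        static void saveToDatabase(byte[] message) { /* hypothetical sink */ }
    }
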
 
Later messages in this thread:
  Neha Narkhede 2013-09-09, 16:17
  Jason Rosenberg 2013-10-03, 20:13