Kafka >> mail # user >> Re: is it possible to commit offsets on a per stream basis?


Re: is it possible to commit offsets on a per stream basis?
That should be fine too.
On Sat, Sep 7, 2013 at 8:33 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> To be clear, it looks like I forgot to add to my question, that I am asking
> about creating multiple connectors, within the same consumer process (as I
> realize I can obviously have multiple connectors running on multiple hosts,
> etc.).  But I'm guessing that should be fine too?
>
> Jason
>
>
>
>
> On Sat, Sep 7, 2013 at 3:09 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>
> > >> Can I create multiple connectors, and have each use the same Regex
> > for the TopicFilter?  Will each connector share the set of available
> > topics?  Is this safe to do?
> >
> > >> Or is it necessary to create mutually non-intersecting regexes for
> > each connector?
> >
> > As long as each of those consumer connectors shares the same group id,
> > Kafka consumer rebalancing should automatically redistribute the
> > topic/partitions amongst the consumer connectors/streams evenly.
> >
> > Thanks,
> > Neha
> >
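[Editor's note] Neha's point about rebalancing can be illustrated with a toy model: when several connectors or streams share one group id, the topic's partitions are split as evenly as possible among them, and adding or removing a stream triggers a fresh split. A minimal sketch of such a range-style split (an illustration of the idea, not Kafka's actual rebalance code):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of spreading partitions evenly over the streams of one
// consumer group. This sketches the idea only; real Kafka rebalancing
// is coordinated through ZooKeeper.
public class RebalanceSketch {
    // Divide numPartitions as evenly as possible over numStreams,
    // with any remainder going to the first streams.
    static List<List<Integer>> assign(int numPartitions, int numStreams) {
        List<List<Integer>> out = new ArrayList<>();
        int base = numPartitions / numStreams;
        int extra = numPartitions % numStreams;
        int p = 0;
        for (int s = 0; s < numStreams; s++) {
            int count = base + (s < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < count; i++) owned.add(p++);
            out.add(owned);
        }
        return out;
    }

    public static void main(String[] args) {
        // 10 partitions over 4 streams -> sizes 3, 3, 2, 2
        System.out.println(assign(10, 4));
    }
}
```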
> >
> > On Mon, Sep 2, 2013 at 1:35 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
> >
> > > Will this work if we are using a TopicFilter that can map to multiple
> > > topics?  Can I create multiple connectors, and have each use the same
> > > Regex for the TopicFilter?  Will each connector share the set of
> > > available topics?  Is this safe to do?
> > >
> > > Or is it necessary to create mutually non-intersecting regexes for each
> > > connector?
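[Editor's note] For background, a whitelist TopicFilter selects the topics whose names match a regex, so two connectors given the same regex would each see the full matching set (with rebalancing, per Neha's answer, dividing the partitions between them). A toy sketch of regex topic selection using plain java.util.regex; the topic names are made up for the example:

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Illustration of regex-based topic selection, analogous to what a
// whitelist TopicFilter does. Not the Kafka class itself.
public class TopicWhitelist {
    static List<String> matching(List<String> topics, String regex) {
        Pattern p = Pattern.compile(regex);
        return topics.stream()
                     .filter(t -> p.matcher(t).matches())
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> topics = List.of("orders.us", "orders.eu", "audit.log");
        // Two connectors using this same regex would both select
        // "orders.us" and "orders.eu".
        System.out.println(matching(topics, "orders\\..*"));
    }
}
```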
> > >
> > > It seems I have a similar issue.  I have been using auto commit mode,
> > > but it doesn't guarantee that all messages committed have been
> > > successfully processed.  (It seems a change to the connector itself
> > > might expose a way to use auto offset commit and have it never commit
> > > a message until it is processed, but that would be a change to the
> > > ZookeeperConsumerConnector.)  Essentially, it would be great if, after
> > > processing each message, we could mark the message as 'processed', and
> > > then use that status as the max offset to commit each time the auto
> > > offset commit background thread wakes up.
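[Editor's note] The scheme Jason describes, committing only up to the highest offset below which every message has been processed, can be sketched independently of the connector. The class and method names below are illustrative only, not part of the Kafka API:

```java
import java.util.concurrent.ConcurrentSkipListSet;

// Sketch of the idea above: worker threads mark individual offsets as
// 'processed' (possibly out of order), and a periodic committer only
// advances over a contiguous run of processed offsets.
public class ProcessedOffsetTracker {
    private final ConcurrentSkipListSet<Long> processed = new ConcurrentSkipListSet<>();
    private long committed = -1; // highest offset safe to commit so far

    void markProcessed(long offset) { processed.add(offset); }

    // Advance the commit point while the next offset is processed.
    synchronized long safeCommitOffset() {
        while (processed.remove(committed + 1)) committed++;
        return committed;
    }

    public static void main(String[] args) {
        ProcessedOffsetTracker t = new ProcessedOffsetTracker();
        t.markProcessed(0); t.markProcessed(1); t.markProcessed(3);
        System.out.println(t.safeCommitOffset()); // 1: offset 2 is unprocessed
        t.markProcessed(2);
        System.out.println(t.safeCommitOffset()); // 3: the gap is filled
    }
}
```

The auto-commit background thread would then commit `safeCommitOffset()` instead of the consumed position.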
> > >
> > > Jason
> > >
> > >
> > > On Thu, Aug 29, 2013 at 11:58 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> > >
> > > > Thanks, Neha. That is a great answer.
> > > >
> > > > Regards,
> > > >
> > > > Libo
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Neha Narkhede [mailto:[EMAIL PROTECTED]]
> > > > Sent: Thursday, August 29, 2013 1:55 PM
> > > > To: [EMAIL PROTECTED]
> > > > Subject: Re: is it possible to commit offsets on a per stream basis?
> > > >
> > > > 1 We can create multiple connectors. From each connector create only
> > > > one stream.
> > > > 2 Use a single thread for a stream. In this case, the connector in
> > > > each thread can commit freely without any dependence on the other
> > > > threads.  Is this the right way to go? Will it introduce any deadlock
> > > > when multiple connectors commit at the same time?
> > > >
> > > > This is a better approach as there is no complex locking involved.
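[Editor's note] The one-connector, one-stream, one-thread pattern endorsed here can be sketched as follows: each worker owns its connector and commits with no cross-thread coordination. `Connector` below is a stand-in class for illustration, not the real ConsumerConnector:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of "one connector per thread": commits need no shared lock
// because no two threads ever touch the same connector.
public class IndependentCommits {
    static class Connector {
        final AtomicLong committed = new AtomicLong(-1);
        void commitOffsets(long upTo) { committed.set(upTo); } // no coordination
    }

    public static void main(String[] args) throws InterruptedException {
        int numWorkers = 3;
        Connector[] connectors = new Connector[numWorkers];
        Thread[] workers = new Thread[numWorkers];
        for (int i = 0; i < numWorkers; i++) {
            Connector c = connectors[i] = new Connector();
            workers[i] = new Thread(() -> {
                for (long offset = 0; offset < 100; offset++) {
                    // ... process the message at `offset`, then commit freely
                    c.commitOffsets(offset);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) w.join();
        for (Connector c : connectors) System.out.println(c.committed.get());
    }
}
```

Because each connector is confined to a single thread, simultaneous commits from different connectors cannot deadlock on anything in this sketch; they share no state at all.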
> > > >
> > > > Thanks,
> > > > Neha
> > > >
> > > >
> > > > On Thu, Aug 29, 2013 at 10:28 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Hi team,
> > > > >
> > > > > This is our current use case:
> > > > > Assume there is a topic with multiple partitions.
> > > > > 1 Create a connector first and create multiple streams from the
> > > > > connector for a topic.
> > > > > 2 Create multiple threads, one for each stream. You can assume the
> > > > > thread's job is to save the message into the database.
> > > > > 3 When it is time to commit offsets, all threads have to synchronize
> > > > > on a barrier before committing the offsets. This is to ensure no
> > > > > message loss in case of process crash.
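[Editor's note] Steps 1-3 above can be sketched with java.util.concurrent.CyclicBarrier: every stream thread finishes its work, waits at the barrier, and a single commit runs once all have arrived. The processing and commit parts are stand-ins, not the real Kafka API:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the barrier design: no thread's offsets are committed
// until every stream thread has finished processing its batch.
public class BarrierCommit {
    static final AtomicLong commits = new AtomicLong();

    public static void main(String[] args) throws InterruptedException {
        int numStreams = 4;
        // The barrier action (the "commit") runs exactly once per cycle,
        // after all numStreams threads have arrived.
        CyclicBarrier barrier = new CyclicBarrier(numStreams, commits::incrementAndGet);
        Thread[] threads = new Thread[numStreams];
        for (int i = 0; i < numStreams; i++) {
            threads[i] = new Thread(() -> {
                try {
                    // ... save this stream's messages to the database, then wait
                    barrier.await(); // blocks until every stream is done
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(commits.get()); // 1: one commit for the whole cycle
    }
}
```

The inefficiency noted below falls out of this sketch directly: every cycle runs at the pace of the slowest stream thread.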
> > > > >
> > > > > As all threads need to synchronize before committing, it is not
> > > > > efficient. This is a workaround:
> > > > >
> > > > > 1 We can create multiple connectors. From each connector create only
> > > > > one stream.
> > > > > 2 Use a single thread for a stream. In this case, the connector in
> > > > > each thread can commit freely without any dependence on the other
> > > > > threads. Is this the right way to go? Will it introduce any deadlock
> > > > > when multiple connectors commit at the same time?
 