

Re: is it possible to commit offsets on a per stream basis?
Will this work if we are using a TopicFilter that can map to multiple
topics?  Can I create multiple connectors, and have each use the same regex
for the TopicFilter?  Will each connector share the set of available
topics?  Is this safe to do?

Or is it necessary to create mutually non-intersecting regexes for each
connector?
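
Here is roughly what I have in mind, a rough sketch against the 0.8
high-level consumer API (the ZooKeeper address, group id, regex and
connector count are just placeholders, and the rebalancing behaviour is
exactly the part I'd like confirmed):

    import java.util.List;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.KafkaStream;
    import kafka.consumer.Whitelist;
    import kafka.javaapi.consumer.ConsumerConnector;

    Properties props = new Properties();
    props.put("zookeeper.connect", "localhost:2181"); // placeholder
    props.put("group.id", "db-writer");               // same group for every connector
    props.put("auto.commit.enable", "true");

    // Several connectors in the same consumer group, each asked for a single
    // stream from the same regex. My understanding is that the group rebalance
    // splits the matching partitions across the connectors rather than
    // duplicating them, but I haven't verified that.
    int numConnectors = 4;
    for (int i = 0; i < numConnectors; i++) {
        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        List<KafkaStream<byte[], byte[]>> streams =
            connector.createMessageStreamsByFilter(new Whitelist("mytopic.*"), 1);
        // hand streams.get(0) and its connector to a dedicated worker thread
    }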

It seems I have a similar issue.  I have been using auto commit mode, but
it doesn't guarantee that all committed messages have been successfully
processed.  A change to the connector itself might expose a way to keep
auto offset commit and have it never commit a message until it has been
processed, but that would mean modifying ZookeeperConsumerConnector.
Essentially, it would be great if, after processing each message, we could
mark it as 'processed' and use that status as the maximum offset to commit
when the auto offset commit background thread wakes up each time.
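
The closest application-side workaround I can see is to turn auto commit
off (auto.commit.enable=false) and call commitOffsets() only after a
message has actually been persisted, one stream per connector as above.
A rough, unverified sketch (saveToDatabase is a placeholder for our own
persistence code):

    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.MessageAndMetadata;

    // One connector, one stream, one thread: commit only after the DB write,
    // so nothing is recorded in ZooKeeper before it has been processed.
    void consume(ConsumerConnector connector, KafkaStream<byte[], byte[]> stream) {
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> msg = it.next();
            saveToDatabase(msg.message());   // placeholder for our own code
            connector.commitOffsets();       // commits offsets for this connector only
        }
    }

Committing after every single message hammers ZooKeeper, so in practice we
would batch it (every N messages or every few seconds), but that is the idea.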

Jason
On Thu, Aug 29, 2013 at 11:58 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:

> Thanks, Neha. That is a great answer.
>
> Regards,
>
> Libo
>
>
> -----Original Message-----
> From: Neha Narkhede [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, August 29, 2013 1:55 PM
> To: [EMAIL PROTECTED]
> Subject: Re: is it possible to commit offsets on a per stream basis?
>
> 1 We can create multiple connectors. From each connector create only one
> stream.
> 2 Use a single thread for a stream. In this case, the connector in each
> thread can commit freely without any dependence on the other threads.  Is
> this the right way to go? Will it introduce any deadlock when multiple
> connectors commit at the same time?
>
> This is a better approach as there is no complex locking involved.
>
> Thanks,
> Neha
>
>
> On Thu, Aug 29, 2013 at 10:28 AM, Yu, Libo <[EMAIL PROTECTED]> wrote:
>
> > Hi team,
> >
> > This is our current use case:
> > Assume there is a topic with multiple partitions.
> > 1 Create a connector first and create multiple streams from the
> > connector for a topic.
> > 2 Create multiple threads, one for each stream. You can assume the
> > thread's job is to save the message into the database.
> > 3 When it is time to commit offsets, all threads have to synchronize
> > on a barrier before committing the offsets. This is to ensure no
> > message loss in case of process crash.
> >
> > As all threads need to synchronize before committing, it is not
> efficient.
> > This is a workaround:
> >
> > 1 We can create multiple connectors. From each connector create only
> > one stream.
> > 2 Use a single thread for a stream. In this case, the connector in
> > each thread can commit freely without any dependence on the other
> > threads.  Is this the right way to go? Will it introduce any deadlock
> > when multiple connectors commit at the same time?
> >
> > It would be great to allow committing on a per stream basis.
> >
> > Regards,
> >
> > Libo
> >
> >
>

 