Kafka >> mail # user >> Proper use of ConsumerConnector


Re: Proper use of ConsumerConnector
Tom,

That is a good suggestion. Some of us started thinking about re-designing
the consumer client a while ago and wrote up some ideas here -
https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design.
In addition to this, we have a working prototype of stage 1 of that
re-design here -
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Detailed+Consumer+Coordinator+Design

Besides this, work has started on scaling the offset storage for the
consumer as part of this JIRA -
https://issues.apache.org/jira/browse/KAFKA-657. It is true that the team
is currently focused on developing and stabilizing replication, but we
welcome ideas and contributions to the consumer client re-design project as
well.

Thanks,
Neha
On Fri, Dec 21, 2012 at 5:44 PM, Tom Brown <[EMAIL PROTECTED]> wrote:

> It seems that a common thread is that while ConsumerConnector works
> well for the standard case, it just doesn't work for any case where
> manual offset management (explicit checkpoints, rollbacks, etc) is
> required.
>
> If any Kafka devs are looking for a way to improve it, I think
> modifying it to be more modular regarding offset management would be
> great! You could provide an interface for loading/committing offsets,
> then provide a ZK implementation as the default. It would be backwards
> compatible, but be useful in all of the use cases where explicit
> offset management is required.
>
> (Of course, I know I'm just an armchair Kafka dev, so there may be
> reasons why this won't work, or why it would be an extremely low
> priority, or...)
>
> --Tom
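Tom's pluggable offset-management idea can be sketched as a small interface with a swappable backing store. This is a hypothetical illustration in Python, not Kafka's actual API: the names `OffsetStore` and `InMemoryOffsetStore` are invented here, and the in-memory class merely stands in for what a default ZooKeeper-backed implementation would do.

```python
# Hypothetical sketch of the pluggable offset storage Tom proposes.
# None of these names exist in Kafka; this only models the idea that
# the consumer talks to an interface and the backend is swappable.
from abc import ABC, abstractmethod


class OffsetStore(ABC):
    """Interface the consumer would call to load and commit offsets."""

    @abstractmethod
    def load(self, topic: str, partition: int) -> int: ...

    @abstractmethod
    def commit(self, topic: str, partition: int, offset: int) -> None: ...


class InMemoryOffsetStore(OffsetStore):
    """Stand-in for the default ZooKeeper-backed implementation."""

    def __init__(self):
        self._offsets = {}

    def load(self, topic, partition):
        # Unknown partitions start from offset 0.
        return self._offsets.get((topic, partition), 0)

    def commit(self, topic, partition, offset):
        self._offsets[(topic, partition)] = offset


store = InMemoryOffsetStore()
store.commit("events", 0, 42)
print(store.load("events", 0))  # → 42
```

A checkpoint/rollback-aware application would then supply its own `OffsetStore` that commits offsets transactionally with its own state, while everyone else keeps the default.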
>
> On Fri, Dec 21, 2012 at 4:12 PM, Yonghui Zhao <[EMAIL PROTECTED]>
> wrote:
> > In our project we use SenseiDB to consume Kafka data. SenseiDB will
> > process each message immediately but won't flush it to disk
> > immediately. So if SenseiDB crashes, all results not yet flushed will
> > be lost, and we want to rewind Kafka. The offset we want to rewind to
> > is the flush checkpoint.
> > In this case, we will lose some data.
> >
> > Sent from my iPad
> >
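Yonghui's rewind-to-flush-checkpoint scheme can be modeled with a toy example. This is an assumed, simplified model with no Kafka involved: the offset is only advanced to a durable checkpoint when the downstream store flushes, so a crash replays the messages processed since the last flush instead of losing them.

```python
# Toy model of the rewind-to-flush-checkpoint scheme (assumed semantics).
messages = [f"msg-{i}" for i in range(10)]
flush_interval = 4  # flush to disk every 4 messages


def run(start_offset, crash_at=None):
    """Consume from start_offset; return (processed, durable checkpoint)."""
    processed, checkpoint = [], start_offset
    for offset in range(start_offset, len(messages)):
        if crash_at is not None and offset == crash_at:
            return processed, checkpoint  # in-memory work since last flush is gone
        processed.append(messages[offset])
        if (offset + 1) % flush_interval == 0:
            checkpoint = offset + 1  # durable flush: safe to record this offset
    return processed, len(messages)


# Crash at offset 6: offsets 0-5 were processed, but only 0-3 were flushed.
_, checkpoint = run(0, crash_at=6)
# Rewind the consumer to the flush checkpoint and replay from there.
replayed, _ = run(checkpoint)
print(checkpoint, replayed[0])  # → 4 msg-4
```

Offsets 4 and 5 get processed twice (duplicates), but nothing after the last flush is lost, which is exactly the trade-off discussed below.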
> > On 2012-12-22, at 1:37, Neha Narkhede <[EMAIL PROTECTED]> wrote:
> >
> >>
> >>  But if a crash happens just after the offset is committed, then
> unprocessed messages in the consumer will be skipped after reconnecting.
> >>
> >> If the consumer crashes, you will get duplicates, not lose any data.
> >>
> >> Thanks,
> >> Neha
> >>
>
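Neha's duplicates-versus-loss point comes down to when the offset is committed relative to processing. A minimal model (assumed, simplified semantics; the restarted consumer simply resumes from the last committed offset):

```python
# Simplified model of commit timing after a crash. Assumption: on
# restart, the consumer reads from the last committed offset onward.
def restart_reads(last_committed, total=5):
    """Offsets the restarted consumer will read, given the last commit."""
    return list(range(last_committed, total))


# Suppose the consumer fetched offset 2 and crashed mid-processing.
# Commit-before-process had already advanced the commit to 3:
print(restart_reads(3))  # → [3, 4]       offset 2 is skipped: data loss
# Commit-after-process had not, so the commit is still 2:
print(restart_reads(2))  # → [2, 3, 4]    offset 2 is re-read: a duplicate
```

Committing after processing (as the high-level consumer does) therefore trades data loss for possible reprocessing, which is why a crash yields duplicates rather than missing messages.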
