Kafka >> mail # user >> consumer may lose data


consumer may lose data
Hi there, this is my first experience with Kafka. We've deployed it in production (soft release) and are using it to create a real-time stream of data. We love it so far.

Running in production, we see messages like these every once in a while:

[2012-06-09 09:02:36,051] ERROR [pool-1-thread-4] (ConsumerIterator.scala 74) - consumed offset: 22013667532 doesn't match fetch offset: 21008498593 for firehose:1-23: fetched offset = 22013667532: consumed offset = 22013667532;
 Consumer may lose data
[2012-06-09 09:22:48,520] ERROR [pool-1-thread-4] (ConsumerIterator.scala 74) - consumed offset: 21013192930 doesn't match fetch offset: 21475567914 for firehose:1-3: fetched offset = 22021419503: consumed offset = 21013192930;
 Consumer may lose data
[2012-06-09 09:42:34,342] ERROR [pool-1-thread-1] (ConsumerIterator.scala 74) - consumed offset: 21017992042 doesn't match fetch offset: 21477363255 for firehose:1-5: fetched offset = 22029075985: consumed offset = 21017992042;
 Consumer may lose data
[2012-06-09 09:46:50,599] ERROR [pool-1-thread-1] (ConsumerIterator.scala 74) - consumed offset: 21017498912 doesn't match fetch offset: 21476883323 for firehose:1-7: fetched offset = 22022716494: consumed offset = 21017498912;
 Consumer may lose data
[2012-06-09 09:50:54,912] ERROR [pool-1-thread-1] (ConsumerIterator.scala 74) - consumed offset: 21016428723 doesn't match fetch offset: 21475750245 for firehose:1-4: fetched offset = 22027573299: consumed offset = 21016428723;
 Consumer may lose data
[2012-06-09 09:58:29,709] ERROR [pool-1-thread-1] (ConsumerIterator.scala 74) - consumed offset: 21017643906 doesn't match fetch offset: 21477006308 for firehose:1-6: fetched offset = 22025778964: consumed offset = 21017643906;
 Consumer may lose data
[2012-06-09 09:59:04,622] ERROR [pool-1-thread-4] (ConsumerIterator.scala 74) - consumed offset: 21017419393 doesn't match fetch offset: 21476749439 for firehose:1-23: fetched offset = 22025584690: consumed offset = 21017419393;
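
To see how far apart the two offsets in each message are, the lines can be parsed mechanically. The following is only a sketch: the class name `OffsetMismatch` and the method `gap` are made up for illustration, and the regular expression assumes the exact wording of the ERROR lines quoted above, which may differ in other Kafka versions.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OffsetMismatch {
    // Assumes the wording of the ERROR lines quoted in this message.
    private static final Pattern LINE = Pattern.compile(
        "consumed offset: (\\d+) doesn't match fetch offset: (\\d+)");

    // Returns (fetch offset - consumed offset), or empty if the line
    // does not have the expected shape.
    public static OptionalLong gap(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        long consumed = Long.parseLong(m.group(1));
        long fetch = Long.parseLong(m.group(2));
        return OptionalLong.of(fetch - consumed);
    }

    public static void main(String[] args) {
        String line = "[2012-06-09 09:22:48,520] ERROR [pool-1-thread-4] "
            + "(ConsumerIterator.scala 74) - consumed offset: 21013192930 "
            + "doesn't match fetch offset: 21475567914 for firehose:1-3: "
            + "fetched offset = 22021419503: consumed offset = 21013192930;";
        System.out.println(gap(line).getAsLong()); // prints 462374984
    }
}
```

The result is negative for the 09:02:36 line, where the consumed offset is the larger of the two numbers.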

I am a bit unsure what Kafka does when the consumed offset doesn't match the fetch offset. We are using a pool of threads to consume each stream created by ConsumerConnector.createMessageStreams(). Is this kosher?
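
For reference, one common way to arrange such a pool is to give every stream exactly one worker, so that no two threads ever read from the same stream. The sketch below illustrates only that threading pattern and does not use the Kafka API: each `BlockingQueue<String>` is a hypothetical stand-in for one stream returned by createMessageStreams(), and the `END` marker is an assumption added so the example terminates (a real stream would block waiting for more messages). Whether sharing a stream between threads is what triggers the error above is not established here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

public class OneThreadPerStream {
    // Hypothetical end-of-stream marker, used only so the demo finishes.
    public static final String END = "__END__";

    // Drains every stream with exactly one worker per stream and returns
    // the number of messages each worker saw, in stream order.
    public static List<Integer> consumeAll(List<BlockingQueue<String>> streams)
            throws Exception {
        // Pool size equals the stream count, so every stream gets its own thread.
        ExecutorService pool =
            Executors.newFixedThreadPool(Math.max(1, streams.size()));
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (BlockingQueue<String> stream : streams) {
                // One task per stream: the stream is never shared between tasks.
                pending.add(pool.submit(() -> {
                    int count = 0;
                    while (!stream.take().equals(END)) {
                        count++;
                    }
                    return count;
                }));
            }
            List<Integer> counts = new ArrayList<>();
            for (Future<Integer> f : pending) {
                counts.add(f.get());
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<String> a = new LinkedBlockingQueue<>(List.of("a1", "a2", END));
        BlockingQueue<String> b = new LinkedBlockingQueue<>(List.of("b1", END));
        System.out.println(consumeAll(List.of(a, b))); // prints [2, 1]
    }
}
```

The contrasting arrangement, where several pool threads pull from one shared stream, would need its own synchronization around the stream; the sketch sidesteps that by construction.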