Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka >> mail # user >> Consuming from X days ago & issues consuming from the beginning of time


Copy link to this message
-
Consuming from X days ago & issues consuming from the beginning of time
Hey guys,

I've come across this behavior with the hadoop-consumer, but it certainly
applies to any consumer.

We've had our brokers up and running for about 9 days, with a 7-day
retention policy. (3 brokers with 3 partitions each)
I've just deployed a new hadoop consumer and wanted to read from the
beginning of time (7-days ago).

Here's the behavior I'm seeing:
- I tell the consumer to start from 0
- It queries the partition, finds the minimum available is 2000000, so it
starts there
- It starts consuming from 2000000+
- It throws an exception ("kafka.common.OffsetOutOfRangeException") because
that message was deleted already

Through sheer luck, after a few task failures the job managed to beat this
race condition, but it begs the question:

- How would I tell a consumer to start querying from T-4days? That would
totally solve the issue. I don't really need a full 7 days, but I have no
way to resolve time -> offset
(this is useful if people are tailing the events too, so they can tail
events from 3 days ago grepping for something)

Any ideas? Anyone else experienced this?
--
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
[EMAIL PROTECTED] | @rathboma <http://twitter.com/rathboma> |
4sq<http://foursquare.com/rathboma>
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB