Re: getOffsetsBefore issue
Hi Raymond,

getOffsetsBefore is "approximate" in that it is based on the mtime of the
log segments, so the offsets it returns are segment boundaries rather than
individual message offsets. Currently it is not possible to drill down to
the exact offset that corresponds to a timestamp. You can set
log.roll.hours to a lower value, which would generally result in a larger
number of log segments (and therefore more candidate offsets) for a given
time period.
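
To illustrate why only two offsets come back, here is a toy Python model of
an mtime-based lookup. It is a sketch, not Kafka's code: the function name
offsets_before and the (start_offset, mtime_ms) tuple layout are made up
for this example, and it assumes the current end of the log is treated as
one extra entry alongside the segment start offsets.

```python
from typing import List, Tuple

# Hypothetical model: each entry is (start_offset, mtime_ms) for one log
# segment, plus (by assumption) one extra entry for the current log end.
Segment = Tuple[int, int]


def offsets_before(segments: List[Segment], timestamp_ms: int,
                   max_offsets: int) -> List[int]:
    """Return up to max_offsets boundary offsets whose mtime is at or
    before timestamp_ms, newest first. Offsets inside a segment are
    never candidates, which is what makes the result approximate."""
    eligible = [start for start, mtime in segments if mtime <= timestamp_ms]
    return sorted(eligible, reverse=True)[:max_offsets]


# One large segment: only its start and the log end can be returned.
single = [(0, 1000), (328317180, 2000)]
print(offsets_before(single, 2000, 10))

# Several smaller segments (e.g. after lowering log.roll.hours):
# more boundaries exist, so a timestamp maps to a closer offset.
rolled = [(0, 1000), (100, 2000), (200, 3000), (300, 4000)]
print(offsets_before(rolled, 2500, 10))
```

In this model the granularity of the answer is exactly the granularity of
the segment boundaries, which is why rolling segments more often helps.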



On Wed, Sep 26, 2012 at 6:45 AM, Raymond Ng <[EMAIL PROTECTED]> wrote:

> Hi
> I'm using kafka 0.7.1 and I'm publishing batched messages with GZIP
> compression
> when calling SimpleConsumer.getOffsetsBefore(topic, partition, (new
> Date()).getTime(), 10) it returns only 2 offsets (0 and 328317180) which is
> the start and end byte of the file,  it doesn't return any of the offsets
> in between.  I've run the DumpLogSegments tool against the kafka file and
> picked up > 2000 offsets, is it to do with GZIP compression?
> also is the method capable of returning offsets before a specific time at
> all? the unit test I've found in
> https://svn.apache.org/repos/asf/incubator/kafka/trunk/core/src/test/scala/unit/kafka/log/LogOffsetTest.scala
> seems to only test OffsetRequest.EarliestTime and OffsetRequest.LatestTime,
> nothing on time in between
> thanks
> --
> Rgds
> Ray