Thanks Jay, this is helpful information. Controlling when offsets are committed is possible only with the SimpleConsumer, correct? I believe the ConsumerConnector (the high-level consumer) commits offsets automatically. It would be nice to have a hook that I can override to tell it when I have finished processing a message, so that it can commit the offset. Has an improvement request for this already been submitted?
On Jul 7, 2013, at 2:35 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> The consumer's position is controlled using a saved "offset" that marks its
> position in the topic/partition it is reading. This position is
> periodically updated. If you update the saved offset before processing
> messages you have the possibility of message loss if your consumer crashes
> before processing the messages. If you update the offset after processing
> the messages you have the possibility of duplicate messages when your
> consumer restarts, as it will reprocess a few messages it has already seen.
> You can control when the position is saved by calling commit().
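
To make the trade-off described above concrete, here is a toy simulation in Python. It is only a sketch and does not use the Kafka client at all: `run_consumer`, `simulate`, and the `crash_at` parameter are hypothetical names made up for illustration, and the "log" is just a list whose indices stand in for offsets. It assumes one commit per message, which a real consumer would probably not do.

```python
def run_consumer(messages, committed, commit_first, crash_at=None):
    """Consume `messages` starting from the saved offset `committed`.

    commit_first=True  saves the offset before processing each message.
    commit_first=False saves the offset after processing each message.
    crash_at is the offset at which the consumer dies between the two steps.
    Returns (messages processed in this run, saved offset at exit).
    """
    processed = []
    offset = committed
    while offset < len(messages):
        if commit_first:
            committed = offset + 1              # save position first
            if offset == crash_at:
                return processed, committed     # crash before processing
            processed.append(messages[offset])
        else:
            processed.append(messages[offset])  # process first
            if offset == crash_at:
                return processed, committed     # crash before saving position
            committed = offset + 1
        offset += 1
    return processed, committed


def simulate(commit_first):
    """Run once with a crash at offset 2, then restart from the saved offset."""
    messages = ["m0", "m1", "m2", "m3"]
    first_run, saved = run_consumer(messages, 0, commit_first, crash_at=2)
    second_run, _ = run_consumer(messages, saved, commit_first)
    return first_run + second_run


if __name__ == "__main__":
    print("commit before processing:", simulate(commit_first=True))
    print("commit after processing: ", simulate(commit_first=False))
```

With the offset saved first, "m2" is never processed (message loss). With the offset saved afterwards, "m2" is processed twice (a duplicate after restart). Which of the two is acceptable depends on whether the processing step can tolerate being repeated.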