David Arthur 2012-11-08, 14:22
Suppose I'm iterating through messages from a topic with 6 messages. I consume three messages and then commit (or an auto-commit happens).
message 1
message 2
message 3
<--- commit
message 4
message 5
message 6
I'm wondering what offset is committed here. Is it the beginning of message 3 or the end of message 3?
I'm particularly curious about what happens if I'm consuming messages and a commit occurs, but then something goes wrong and I fail to process that message (whatever "process" means here). Ideally, if I commit in the above scenario and then my consumer dies, the consumers will rebalance and someone else will pick up message 3. Is this what happens?
This also leads me to ponder about corrupt messages and retries and skipping "bad" messages...
Cheers, David
Jun Rao 2012-11-08, 15:31
Once message 3 is returned in next(), offset moves to message 4. So, if the offset is committed and the consumer crashes, another consumer resumes consumption from message 4. If auto commit is enabled, the offset is committed periodically. If you use manual offset commit, you can commit the offset after the message is truly consumed.
Thanks,
Jun
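To make the answer above concrete, here is a minimal sketch of the offset semantics Jun describes. It is a toy model, not the Kafka consumer API: the `ToyConsumer` class and its `next()` and `commit()` methods are hypothetical names chosen for illustration. It assumes only what the reply states, namely that the offset advances as soon as `next()` returns a message and that a commit records that advanced offset.

```python
class ToyConsumer:
    """Toy model of the offset semantics described above (not the Kafka client API)."""

    def __init__(self, log, committed_offset=0):
        self.log = log
        self.position = committed_offset   # index of the next message to return
        self.committed = committed_offset  # last offset recorded by commit()

    def next(self):
        message = self.log[self.position]
        self.position += 1  # the offset moves past the message as soon as it is returned
        return message

    def commit(self):
        # The committed offset points at the next message to consume,
        # not at the message that was just returned.
        self.committed = self.position


log = ["message %d" % i for i in range(1, 7)]

consumer = ToyConsumer(log)
for _ in range(3):
    consumer.next()   # returns message 1, 2, 3
consumer.commit()     # commit happens after message 3 was returned

# Simulate a crash: a replacement consumer starts from the committed offset.
replacement = ToyConsumer(log, committed_offset=consumer.committed)
resumed_at = replacement.next()
print(resumed_at)
```

In this model the replacement consumer resumes at message 4, so message 3 is lost if it was returned but never processed. Calling `commit()` only after a message has been fully processed, as suggested for manual commits, means a crash causes reprocessing instead of loss.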
David Arthur 2012-11-08, 16:32
Thanks, Jun. This is what I was looking for. I did a little experimentation and found that if I crashed the JVM immediately after consuming a message, the offset is not always changed. I'm guessing this is due to the offsets being committed by a separate thread and lucky timing.
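The timing effect guessed at above can be sketched with a small deterministic model. This is an illustration under assumptions, not a description of how the client is implemented: the function name, the time values, and the idea that a background commit simply records the current position at fixed times are all hypothetical.

```python
def committed_offset_at_crash(consumed_at, commit_times, crash_time):
    """Return the last committed offset when the consumer crashes at crash_time.

    consumed_at:  times at which next() returned each message, in order.
    commit_times: times at which the (assumed) background auto-commit fires.
    """
    committed = 0
    for t in commit_times:
        if t > crash_time:
            break  # the process died before this commit could run
        # The commit records the position at time t: messages returned so far.
        committed = sum(1 for c in consumed_at if c <= t)
    return committed


consumed_at = [1, 2, 3]  # three messages returned at t=1, 2, 3

# Crash shortly after consuming, before the next periodic commit at t=10.
early_crash = committed_offset_at_crash(consumed_at, commit_times=[0, 10], crash_time=4)

# Crash after the periodic commit at t=10 has run.
late_crash = committed_offset_at_crash(consumed_at, commit_times=[0, 10], crash_time=11)

print(early_crash, late_crash)
```

Under these assumptions, a crash right after consumption leaves the committed offset unchanged unless a periodic commit happened to fire in between, which would match the "not always changed" behaviour seen in the experiment.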