One thing you could do is the following:
Every time you consume a message, you send it asynchronously to a pool of
actor that will process the message.
When the actors fail to process a message (or if it takes them longer than
a certain arbitrary time-out period), they republish it to another topic
and move on. You can then have another consumer that takes care of those
problematic messages, while the main consumer keeps going with the easy to
This would allow you to not be blocked by messages that take a long time to
However, this system is not as reliable, because the Kafka consumer will
assume that a message is correctly consumed (and move the offset
accordingly) as soon as it is sent to the actor pool, even though the
message is not necessarily processed yet. In certain failure scenarios,
some messages would end up never being consumed at all.
Another limitation is that such system doesn't allow you to ensure that the
actors process the messages in the same order Kafka consumes them.
I guess it could work for certain use cases...
On Thu, Sep 13, 2012 at 5:10 PM, Vito Sabella <[EMAIL PROTECTED]> wrote:
> ** Resending due to outlook.com corrupting the original message. Sorry :)
> I'm looking to use kafka in a pub-sub model where a consumer reads
> from Kafka and does some processing on the message. How would you
> recommend a commit to Zookeeper / setting the last message consumed
> location if processing one of the messages in the pipe is more
> unreliable than the others.
> Let's say I read a batch of 10 messages (1-10) and I successfully
> process messages 1-8 and 10 quickly, but message #9 is taking an
> inordinately long time to process. I don't want to write the message
> as consumed against Zookeeper but I also don't want to block forward
> progress of the pipeline for that topic/partition.
> In the edge case, let's say processing all messages was successful but
> processing message #9 totally failed/timed out, but a reattempt at
> processing it would not result in failure (web-service call, strange
> network condition). In the timeout/failure case would the suggestion
> be to re-queue the message?
> Does anyone have any recommendations for the above scenarios?