Kafka >> mail # user >> RE: storing last processed offset, recovery of failed message processing etc.


Re: storing last processed offset, recovery of failed message processing etc.
When using ZK to keep track of last offsets, metrics, etc., how do you know
when you are pushing your ZK cluster to its limit?

Or can ZK handle thousands of reads/writes per second without trouble, since it
is all in-memory?  Even so, you need some idea of its upper limits and
how close you are to them.
On Mon, Dec 9, 2013 at 3:31 PM, Philip O'Toole <[EMAIL PROTECTED]> wrote:

> We use Zookeeper, as is standard with Kafka.
>
> Our systems are idempotent, so we only store offsets when the message is
> fully processed. If this means we occasionally replay a message due to some
> corner-case, or simply a restart, it doesn't matter.
>
> Philip
>
>
> On Mon, Dec 9, 2013 at 12:28 PM, S Ahmed <[EMAIL PROTECTED]> wrote:
>
> > I was hoping people could comment on how they handle the following
> > scenarios:
> >
> > 1. Storing the last successfully processed messageId/Offset.  Are people
> > using mysql, redis, etc.?  What are the tradeoffs here?
> >
> > 2. How do you handle recovering from an error while processing a given
> > event?
> >
> > There are various scenarios for #2, like:
> > 1. Do you mark the start of processing a message somewhere, and then update
> > the status to complete and THEN update the last message processed for #1?
> > 2. Do you only mark the status as complete, and not the start of processing
> > it?  I guess this depends on whether there are intermediate steps and processing
> > the entire message again would result in some duplicated work, right?
> >
>
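
For illustration, the approach Philip describes (store the offset only once a
message is fully processed, and rely on idempotent processing to absorb the
occasional replay) might be sketched as below. This is a minimal in-memory
sketch, not anyone's actual implementation: OffsetStore, consume, the fail_at
parameter, and the dict used as a sink are all hypothetical names, and the
store stands in for ZooKeeper or whatever durable store is used.

```python
class OffsetStore:
    """Hypothetical stand-in for a durable offset store (e.g. ZooKeeper)."""

    def __init__(self):
        self.last_processed = -1  # -1 means nothing has been processed yet

    def load(self):
        return self.last_processed

    def save(self, offset):
        self.last_processed = offset


def consume(log, store, sink, fail_at=None):
    """Process every message after the stored offset.

    The offset is saved only AFTER the message has been fully processed,
    so a crash in between causes a replay rather than a lost message.
    """
    start = store.load() + 1
    for offset in range(start, len(log)):
        key, value = log[offset]
        # Idempotent write: applying the same message twice leaves the
        # sink in the same state as applying it once.
        sink[key] = value
        if offset == fail_at:
            raise RuntimeError("simulated crash before the offset was stored")
        store.save(offset)


log = [("a", 1), ("b", 2), ("c", 3)]  # messages at offsets 0, 1, 2
store = OffsetStore()
sink = {}

try:
    consume(log, store, sink, fail_at=1)  # crashes after applying offset 1
except RuntimeError:
    pass

# Restart: offset 1 was applied but never recorded, so it is replayed.
consume(log, store, sink)
```

Because the write to the sink is idempotent, the replay of offset 1 after the
simulated crash is harmless. If processing were not idempotent, the
start/complete status marking asked about in the original question would be
one way to detect partially processed messages.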