Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka, mail # user - async producer behavior if zk and/or kafka cluster goes away...


Copy link to this message
-
Re: async producer behavior if zk and/or kafka cluster goes away...
Jay Kreps 2012-11-20, 04:50
In 0.8 there is no way to use zookeeper from the producer and no connection
from the client. There isn't even a way to configure a zk connection. Are
you sure you checked out the 0.8 branch?

Check the code you've got:
*jkreps-mn:kafka-0.8 jkreps$ svn info*
*Path: .*
*URL: https://svn.apache.org/repos/asf/incubator/kafka/branches/0.8*
*Repository Root: https://svn.apache.org/repos/asf*

The key is that it should have come from the URL kafka/branches/0.8.

-Jay
On Mon, Nov 19, 2012 at 3:30 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> Regarding the poducer/zk connection:  if I am using zk to discover the
> kafka cluster, doesn't the producer get updates if zk's knowledge of the
> cluster changes?  Or does it only reconsult zk if the particular kafka node
> it was "getting metadata" from goes away?  Should I not be using a
> "zk.connect" but instead a "broker.list" when using a producer (that would
> seem restrictive)?  What I've noticed is that the instant the zk server is
> taken down, my producer immediately starts logging connection errors to zk,
> every second, and never stops this logging until zk comes back.  So it
> certainly feels like the producer is attempting to maintain a direct
> connection to zk.  I suppose I expected it to try for the connection
> timeout period (e.g. 6000ms), and then give up, until the next send
> request, etc.
>
> Perhaps what it should do is make that initial zk connection to find the
> kafka broker list, then shutdown the zk connection if it really doesn't
> need it after that, until possibly recreating it if needed if it can no
> longer make contact with the kafka cluster.
>
> For the async queuing behavior, I agree, it's difficult to respond to a
> send request with an exception, when the sending is done asynchronously, in
> a different thread.  However, this is the behavior when the producer is
> started initially, with no zk available (e.g. producer.send() gets an
> exception).  So, the api is inconsistent, in that it treats the
> unavailability of zk differently, depending on whether it was unavailable
> at the initial startup, vs. a subsequent zk outage after previously having
> been available.
>
> I am not too concerned about not having 100% guarantee that if I
> successfully call producer.send(), that I know it was actually delivered.
>  But it would be nice to have some way to know the current health of the
> producer, perhaps some sort of "producerStatus()" method.  If the async
> sending thread is having issues sending, it might be nice to expose that to
> the client.  Also, if the current producerStatus() is not healthy, then I
> think it might be ok to not accept new messages to be sent (e.g.
> producer.send() could throw an Exception in that case).
>
> Returning a Future for each message sent seems a bit unscalable.....I'm not
> sure clients want to be tying up resources waiting for Futures all the time
> either.
>
> I'm also seeing that if  kafka goes down, while zk stays up, subsequent
> calls to producer.send() fail immediately with an exception ("partition is
> null").  I think this makes sense, although, in that case, what is the fate
> of previously buffered but unsent messages?  Are they all lost?
>
> But I'd like it if zk goes down, and then kafka goes down, it would behave
> the same way as if only kafka went down.  Instead, it continues happily
> buffering messages, with lots of zk connection errors logged, but no way
> for the client code to know that things are not hunky dory.
>
> In summary:
>
> 1. If zk connection is not important for a producer, why continually log zk
> connection errors every second, while at the same time have the client
> behave as if nothing is wrong and just keep accepting messages.
> 2. if zk connection goes down, followed by kafka going down, it behaves no
> differently than if only zk went down, from the client's perspective (it
> keeps accepting new messages).
> 3. if zk connection stays up, but kafka goes down, it then fails
> immediately with an exception in a call to producer.send (I think this