I don't think there is currently a way to detect this condition
other than by alerting on consumer metrics or logs.

However, I'm not sure it can be called a "fatal" condition, since
the brokers could re-register in ZooKeeper and consumption would then
resume, unless someone decides to move the Kafka cluster to some
other ZooKeeper namespace without telling anyone.

What would be a suitable action on the application-side if such a
condition were propagated back to the application as an exception?
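
For what it's worth, here is a rough sketch of the kind of
application-side check I have in mind when I say "alerting off
consumer metrics": a small watchdog that records when the last
message was consumed and reports a stall once a timeout has passed.
It makes no Kafka calls at all; the names StallWatchdog,
record_message, is_stalled and stall_timeout_s are made up for
illustration, and the 30-second value is an arbitrary assumption.

```python
import time


class StallWatchdog:
    """Hypothetical helper: flags a stall when no message has been
    recorded for at least stall_timeout_s seconds."""

    def __init__(self, stall_timeout_s, clock=time.monotonic):
        self.stall_timeout_s = stall_timeout_s
        self.clock = clock  # injectable so the check can be tested
        self.last_progress = clock()

    def record_message(self):
        # Call this from the consume loop each time a message arrives.
        self.last_progress = self.clock()

    def is_stalled(self):
        # True once the quiet period reaches the configured timeout.
        return self.clock() - self.last_progress >= self.stall_timeout_s


if __name__ == "__main__":
    watchdog = StallWatchdog(stall_timeout_s=30.0)  # assumed value
    watchdog.record_message()
    print("stalled:", watchdog.is_stalled())
```

The application would call record_message() inside its consume loop
and poll is_stalled() from a timer or a separate thread. Note that a
quiet topic looks the same as a missing cluster to a check like this,
and what to do on a stall (alert, exit, restart) is still the open
question above.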
On Tue, Jan 07, 2014 at 06:00:29PM +0000, Paul Mackles wrote:
> Hi - I noticed that if a kafka cluster goes away entirely, the high-level consumer will endlessly try to fetch metadata until the cluster comes back up, never bubbling the error condition up to the application. While I see a setting to control the interval at which it reconnects, I don't see anything to tell it when to just give up. I think it would be useful if there were a way for the application to detect this condition and possibly take some sort of action. Either a max-retries setting and/or some sort of flag that can be tested after a timeout. Is that capability already there? Is there a known workaround for this?