Kafka >> mail # user >> async producer behavior if zk and/or kafka cluster goes away...
Re: async producer behavior if zk and/or kafka cluster goes away...
Trunk does not have the latest 0.8 code yet. We plan to merge 0.8 back
into trunk soon, but it hasn't happened yet.

Typically, the number of producers to a production Kafka cluster is
very large, which means a large number of connections
to zookeeper. If there is a slight blip on the zookeeper cluster due
to network error, disk latency or GC, this can cause
a lot of churn as zookeeper will now try to expire tens of thousands
of zk sessions.

Basically, you want zookeeper on the producer to do just one thing -
notify the producer of changes in the liveness of brokers in the Kafka
cluster. In 0.8, brokers are not the entity to worry about; what we
care about are the replicas for the partitions that the producer
is sending data to, in particular just the leader replica (since only
the leader can accept writes for a partition).

The producer keeps a cache of (topic, partition) -> leader-replica.
Now, if that cache is either empty or stale (due to changes
on the Kafka cluster), the next produce request will get an ACK with
the error code NotLeaderForPartition. That's when it
fires the getMetadata request that refreshes its cache. Assuming
you've configured your producer to retry (producer.num.retries)
at least once, it will succeed in sending the data the next time.
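That refresh-on-error flow can be sketched as a small simulation; this is not Kafka's actual producer code, just an illustration of the protocol described above, with all class and method names invented:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the producer-side leader cache and retry loop
// described above; names are invented, not Kafka's real classes.
public class LeaderCacheSketch {

    // Simulated cluster state: partition -> current leader broker id.
    static Map<String, Integer> clusterLeaders = new HashMap<>();

    // Producer-side cache of (topic, partition) -> leader replica.
    static Map<String, Integer> leaderCache = new HashMap<>();

    // A produce request succeeds only if sent to the current leader;
    // otherwise the broker ACKs with a NotLeaderForPartition-style error.
    static boolean produce(String partition, Integer cachedLeader) {
        return cachedLeader != null
                && cachedLeader.equals(clusterLeaders.get(partition));
    }

    // The getMetadata call: any broker can answer it, refreshing the cache.
    static void refreshMetadata(String partition) {
        leaderCache.put(partition, clusterLeaders.get(partition));
    }

    // Retry loop: on failure, refresh metadata lazily and retry,
    // up to producer.num.retries extra attempts.
    static boolean send(String partition, int numRetries) {
        for (int attempt = 0; attempt <= numRetries; attempt++) {
            if (produce(partition, leaderCache.get(partition))) {
                return true;
            }
            refreshMetadata(partition); // lazy update, only on error
        }
        return false;
    }

    public static void main(String[] args) {
        clusterLeaders.put("topicA-0", 1);
        refreshMetadata("topicA-0");              // initial metadata fetch
        System.out.println(send("topicA-0", 1));  // leader unchanged: succeeds

        clusterLeaders.put("topicA-0", 2);        // leader moved (e.g. broker restart)
        System.out.println(send("topicA-0", 1));  // stale cache: fail, refresh, retry, succeed
    }
}
```

Note that the producer never watches anything; the cache is only corrected when a send actually fails.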

In other words, instead of zookeeper 'notifying' us of the changes on
the Kafka cluster, we let the producer lazily update its
cache by invoking a special API on any of the Kafka brokers. That way,
we have far fewer connections to zk, zk upgrades
are easier, so are upgrades to the producer, and we also achieve the
goal of replica discovery.
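Concretely, such a producer would be configured with a static broker list for metadata bootstrapping instead of a zk.connect string. The property names below ("broker.list", "producer.num.retries") are the ones mentioned in this thread; treat the exact names and values as an assumption, since they may differ between 0.8 builds:

```java
import java.util.Properties;

// Hypothetical 0.8-era producer configuration sketch: brokers are
// bootstrapped from a static list and no zookeeper connection is
// made from the producer. Property names follow this thread and
// may differ between builds.
public class ProducerConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Any broker in the list can answer the metadata request.
        props.put("broker.list", "broker1:9092,broker2:9092,broker3:9092");
        // Retry at least once so a stale leader cache is refreshed
        // and the send re-attempted, as described above.
        props.put("producer.num.retries", "3");
        System.out.println(props.getProperty("broker.list"));
    }
}
```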

As for the asymmetry between producers and consumers, we have a
proposal and some initial code written to address that in a
future release -


On Tue, Nov 20, 2012 at 7:57 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
> I checked out trunk.  I guess I assumed that included the latest 0.8.  Is
> that not right?  Am I just looking at 0.7.x+?
> Honestly, I don't think it would be a positive thing not to be able to rely
> on zookeeper in producer code.  How does that affect the discovery of a
> kafka cluster under dynamic conditions?  We'd expect to have a much higher
> SLA for the zookeeper cluster than for kafka.  We'd like to be able to
> freely do rolling restarts of the kafka cluster, etc.
> Also, it seems a bit asymmetric to use zk for the kafka brokers and
> consumers, but not the producers.
> Jason
> On Mon, Nov 19, 2012 at 8:50 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>> In 0.8 there is no way to use zookeeper from the producer and no connection
>> from the client. There isn't even a way to configure a zk connection. Are
>> you sure you checked out the 0.8 branch?
>> Check the code you've got:
>> jkreps-mn:kafka-0.8 jkreps$ svn info
>> Path: .
>> URL: https://svn.apache.org/repos/asf/incubator/kafka/branches/0.8
>> Repository Root: https://svn.apache.org/repos/asf
>> The key is that it should have come from the URL kafka/branches/0.8.
>> -Jay
>> On Mon, Nov 19, 2012 at 3:30 PM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
>> > Regarding the producer/zk connection:  if I am using zk to discover the
>> > kafka cluster, doesn't the producer get updates if zk's knowledge of the
>> > cluster changes?  Or does it only reconsult zk if the particular kafka node
>> > it was "getting metadata" from goes away?  Should I not be using a
>> > "zk.connect" but instead a "broker.list" when using a producer (that would
>> > seem restrictive)?  What I've noticed is that the instant the zk server is
>> > taken down, my producer immediately starts logging connection errors to zk,
>> > every second, and never stops this logging until zk comes back.  So it
>> > certainly feels like the producer is attempting to maintain a direct
>> > connection to zk.  I suppose I expected it to try for the connection
>> > timeout period (e.g. 6000ms), and then give up, until the next send