Kafka, mail # user - async producer behavior if zk and/or kafka cluster goes away...


Re: async producer behavior if zk and/or kafka cluster goes away...
Neha Narkhede 2012-11-20, 19:18
The docs have not been updated yet, since 0.8 is not yet released.

Thanks,
Neha

On Tue, Nov 20, 2012 at 11:09 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
> Is there a configuration doc page for 0.8 (since apparently there are some
> new settings)?
>
> Jason
>
> On Tue, Nov 20, 2012 at 10:39 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
>> That's right. The VIP is only used for getting metadata. All producer send
>> requests go through direct RPC to each broker.
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Nov 20, 2012 at 10:28 AM, Jason Rosenberg <[EMAIL PROTECTED]>
>> wrote:
>>
>> > Ok,
>> >
>> > I think I understand (so I'll need to change some things in our setup to
>> > work with 0.8).
>> >
>> > So the VIP is only for getting meta-data?  After that, under the covers,
>> > the producers will make direct connections to individual kafka hosts that
>> > they learned about from connecting through the VIP?
>> >
>> > Jason
>> >
>> > On Tue, Nov 20, 2012 at 10:20 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>> >
>> > > I think the confusion is that we are answering a slightly different
>> > > question than the one you are asking. If I understand correctly, you are
>> > > asking, "do I need to put ALL the kafka broker urls into the config for
>> > > the client, and will this need to be updated if I add machines to the
>> > > cluster?".
>> > >
>> > > The answer to both these questions is no. The broker list configuration
>> > > will work exactly as your zookeeper configuration worked. Namely, you
>> > > must have the URL of at least one operational broker in the cluster, and
>> > > the producer will use this/these urls to fetch a complete topology of the
>> > > cluster (all nodes, and what partitions they have). If you add kafka
>> > > brokers or migrate partitions from one broker to another, clients will
>> > > automatically discover this and adjust appropriately with no need for
>> > > config changes. The broker list you give is only used when fetching
>> > > metadata; all producer requests go directly to the appropriate broker.
>> > > As a result you can use a VIP for the broker list if you like, without
>> > > having any of the actual data you send go through that VIP.
>> > >
>> > > As Neha and Jun mentioned there were a couple of reasons for this
>> > > change:
>> > > 1. If you use kafka heavily everything ends up connecting to zk and any
>> > > operational change to zk or upgrade becomes immensely difficult.
>> > > 2. Zk support outside java is spotty at best.
>> > > 3. In effect we were not using zk for what it is good at--discovery--because
>> > > discovery is asynchronous. That is, if you try to send to the wrong
>> > > broker we need to give you an error right away and have you update your
>> > > metadata, and this will likely happen before the zk watcher fires. Plus
>> > > once you handle this case you don't need the watcher. As a result zk is
>> > > just being used as a key-value store.
>> > >
>> > > -Jay
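
The behaviour described in the message above (bootstrap from any one reachable broker, send directly to the broker that hosts each partition, refresh metadata when a send hits the wrong broker) can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: Broker, Cluster, Producer, NotLeaderError and the addresses are hypothetical names, not the Kafka client API, and no real network calls are made.

```python
from dataclasses import dataclass, field


class NotLeaderError(Exception):
    """Raised by a toy broker asked to accept data for a partition it does not host."""


@dataclass
class Broker:
    address: str
    partitions: set = field(default_factory=set)  # (topic, partition) pairs hosted here
    log: list = field(default_factory=list)       # messages accepted by this broker

    def send(self, topic, partition, message):
        if (topic, partition) not in self.partitions:
            raise NotLeaderError(f"{self.address} does not host {topic}-{partition}")
        self.log.append((topic, partition, message))


class Cluster:
    """Hypothetical cluster: any live broker can describe the full topology."""

    def __init__(self, brokers):
        self.brokers = {b.address: b for b in brokers}
        self.metadata_requests = []  # which addresses were asked for metadata

    def fetch_metadata(self, address):
        if address not in self.brokers:
            raise ConnectionError(f"cannot reach {address}")
        self.metadata_requests.append(address)
        return {tp: b.address for b in self.brokers.values() for tp in b.partitions}


class Producer:
    def __init__(self, cluster, broker_list):
        self.cluster = cluster
        self.broker_list = list(broker_list)  # used only for metadata, never for data
        self.leaders = {}
        self.refresh_metadata()

    def refresh_metadata(self):
        # Try each configured address until one answers; one live broker is enough.
        for address in self.broker_list:
            try:
                self.leaders = self.cluster.fetch_metadata(address)
                return
            except ConnectionError:
                continue
        raise ConnectionError("no broker in the configured list is reachable")

    def send(self, topic, partition, message):
        # Send directly to the broker we believe hosts the partition; on a
        # wrong-broker error, refresh metadata once and retry.
        for _ in range(2):
            target = self.cluster.brokers[self.leaders[(topic, partition)]]
            try:
                target.send(topic, partition, message)
                return target.address
            except NotLeaderError:
                self.refresh_metadata()
        raise RuntimeError("partition host still unknown after metadata refresh")


b1 = Broker("b1:9092", {("events", 0)})
b2 = Broker("b2:9092", {("events", 1)})
cluster = Cluster([b1, b2])

# The configured list names one dead address and one live broker; it never lists b2.
producer = Producer(cluster, ["dead:9092", "b1:9092"])
first = producer.send("events", 1, "hello")  # data goes straight to b2

# Add a new broker and migrate a partition to it; the producer config is untouched.
b3 = Broker("b3:9092", {("events", 1)})
b2.partitions.discard(("events", 1))
cluster.brokers[b3.address] = b3
second = producer.send("events", 1, "again")
```

In this sketch the configured list is consulted only inside refresh_metadata, which is why a VIP (or any single live address) would be enough there, and why the stale-metadata error makes a separate watcher unnecessary.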
>> > >
>> > >
>> > >
>> > > On Tue, Nov 20, 2012 at 9:44 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
>> > >
>> > > > Ok,
>> > > >
>> > > > So, I'm still wrapping my mind around this.  I liked being able to
>> > > > use zk for all clients, since it made it very easy to think about how
>> > > > to update the kafka cluster.  E.g. how to add new brokers, how to move
>> > > > them all to new hosts entirely, etc., without having to redeploy all
>> > > > the clients.  The new brokers will simply advertise their new location
>> > > > via zk, and all clients will pick it up.
>> > > >
>> > > > Requiring a configured broker.list for each client means that 1000's
>> > > > of deployed apps need to be updated any time the kafka cluster
>> > > > changes, no?  (Or am I not understanding?).
>> > > >
>> > > > You mention that auto-discovery of new brokers will still work. Is
>> > > > that dependent on the existing configured broker.list set still being
>> > > > available also?
>> > > >
>> > > > I can see though how this will greatly reduce the load on zookeeper.