Kafka >> mail # user >> async producer behavior if zk and/or kafka cluster goes away...


Neha Narkhede 2012-11-20, 18:35
Re: async producer behavior if zk and/or kafka cluster goes away...
That's right. The VIP is used only for fetching metadata. All producer send
requests go directly to each broker over RPC.

Thanks,

Jun
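
The behavior confirmed above can be illustrated with a toy model. This is only a sketch: it does not use the Kafka client library, and every name in it (Cluster, Broker, Producer, NotLeaderError, the address kafka-vip:9092, the broker names) is hypothetical. It assumes that any reachable broker can return the full partition-to-leader map, and that a broker which does not lead a partition rejects the send so the producer refreshes its metadata and retries.

```python
"""Toy model of producer metadata bootstrapping. Not the Kafka client API."""


class NotLeaderError(Exception):
    """Raised by a broker that does not lead the requested partition."""


class Broker:
    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster
        self.received = []  # (partition, message) pairs accepted by this broker

    def fetch_metadata(self):
        # Assumption: any live broker can describe the whole cluster.
        return dict(self.cluster.leaders)

    def send(self, partition, message):
        if self.cluster.leaders[partition] != self.name:
            raise NotLeaderError(partition)
        self.received.append((partition, message))


class Cluster:
    VIP = "kafka-vip:9092"  # hypothetical load-balancer address

    def __init__(self, leaders):
        self.leaders = dict(leaders)  # partition -> name of the leading broker
        self.brokers = {}
        for name in leaders.values():
            self.add_broker(name)

    def add_broker(self, name):
        self.brokers.setdefault(name, Broker(name, self))

    def resolve(self, address):
        # The VIP forwards to some live broker; other addresses are direct.
        if address == self.VIP:
            return next(iter(self.brokers.values()), None)
        return self.brokers.get(address)


class Producer:
    def __init__(self, cluster, broker_list):
        self.cluster = cluster                # stands in for the network
        self.broker_list = list(broker_list)  # used only to fetch metadata
        self.metadata_fetches = 0
        self.leaders = {}
        self.refresh_metadata()

    def refresh_metadata(self):
        for address in self.broker_list:
            broker = self.cluster.resolve(address)
            if broker is not None:
                self.leaders = broker.fetch_metadata()
                self.metadata_fetches += 1
                return
        raise ConnectionError("no broker in the broker list is reachable")

    def send(self, partition, message):
        for _ in range(2):  # one retry after a metadata refresh
            leader = self.cluster.brokers[self.leaders[partition]]
            try:
                leader.send(partition, message)  # direct, never via the VIP
                return leader.name
            except NotLeaderError:
                self.refresh_metadata()
        raise RuntimeError("leader still unknown after metadata refresh")


cluster = Cluster({0: "broker-a", 1: "broker-b"})
producer = Producer(cluster, [Cluster.VIP])
first = producer.send(0, "hello")

# A new broker joins and takes over partition 1; the client config is untouched.
cluster.add_broker("broker-c")
cluster.leaders[1] = "broker-c"
second = producer.send(1, "world")
```

In this model the second send first hits the old leader, gets an error, refetches metadata through the VIP, and then goes straight to the new broker, all without any change to the configured broker list.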

On Tue, Nov 20, 2012 at 10:28 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:

> Ok,
>
> I think I understand (so I'll need to change some things in our setup to
> work with 0.8).
>
> So the VIP is only for getting meta-data?  After that, under the covers,
> the producers will make direct connections to individual kafka hosts that
> they learned about from connecting through the VIP?
>
> Jason
>
> On Tue, Nov 20, 2012 at 10:20 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
> > I think the confusion is that we are answering a slightly different
> > question than the one you are asking. If I understand correctly, you are
> > asking, "Do I need to put ALL the kafka broker urls into the config for
> > the client, and will this need to be updated if I add machines to the
> > cluster?"
> >
> > The answer to both of these questions is no. The broker list configuration
> > will work exactly as your zookeeper configuration worked. Namely, you must
> > have the URL of at least one operational broker in the cluster, and the
> > producer will use this URL (or these URLs) to fetch a complete topology
> > of the cluster (all nodes, and which partitions they have). If you add
> > kafka brokers or migrate partitions from one broker to another, clients
> > will automatically discover this and adjust appropriately with no need
> > for config changes. The broker list you give is only used when fetching
> > metadata; all producer requests go directly to the appropriate broker.
> > As a result you can use a VIP for the broker list if you like, without
> > having any of the actual data you send go through that VIP.
> >
> > As Neha and Jun mentioned, there were a couple of reasons for this change:
> > 1. If you use kafka heavily, everything ends up connecting to zk, and any
> > operational change to zk or upgrade becomes immensely difficult.
> > 2. Zk support outside java is spotty at best.
> > 3. In effect we weren't using zk for what it is good at--discovery--because
> > discovery is asynchronous. That is, if you try to send to the wrong broker
> > we need to give you an error right away and have you update your metadata,
> > and this will likely happen before the zk watcher fires. Plus, once you
> > handle this case you don't need the watcher. As a result zk is just being
> > used as a key-value store.
> >
> > -Jay
> >
> >
> >
> > On Tue, Nov 20, 2012 at 9:44 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
> >
> > > Ok,
> > >
> > > So, I'm still wrapping my mind around this. I liked being able to use
> > > zk for all clients, since it made it very easy to think about how to
> > > update the kafka cluster, e.g. how to add new brokers, how to move them
> > > all to new hosts entirely, etc., without having to redeploy all the
> > > clients. The new brokers will simply advertise their new location via
> > > zk, and all clients will pick it up.
> > >
> > > Requiring a configured broker.list for each client means that 1000s of
> > > deployed apps need to be updated any time the kafka cluster changes,
> > > no? (Or am I not understanding?)
> > >
> > > You mention that auto-discovery of new brokers will still work. Does
> > > that depend on the brokers in the existing configured broker.list still
> > > being available as well?
> > >
> > > I can see though how this will greatly reduce the load on zookeeper.
> > >
> > > Jason
> > >
> > >
> > >
> > > On Tue, Nov 20, 2012 at 9:03 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > >
> > > > Jason,
> > > >
> > > > Auto discovery of new brokers and rolling restart of brokers are
> > > > still supported in 0.8. It's just that most of the ZK related logic
> > > > is moved to the broker.
> > > >
> > > > There are 2 reasons why we want to remove zkclient from the client.
> > > >
> > > > 1. If the client goes into a GC pause, it can cause zk session
> > > > expiration, which causes churn in the client and extra load on the
> > > > zk server.