Kafka >> mail # user >> async producer behavior if zk and/or kafka cluster goes away...

Re: async producer behavior if zk and/or kafka cluster goes away...
Is there a configuration doc page for 0.8 (since apparently there are some
new settings)?

Jason

On Tue, Nov 20, 2012 at 10:39 AM, Jun Rao <[EMAIL PROTECTED]> wrote:

> That's right. VIP is only used for getting metadata. All producer send
> requests are through direct RPC to each broker.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 20, 2012 at 10:28 AM, Jason Rosenberg <[EMAIL PROTECTED]>
> wrote:
>
> > Ok,
> >
> > I think I understand (so I'll need to change some things in our set up to
> > work with 0.8).
> >
> > So the VIP is only for getting meta-data?  After that, under the covers,
> > the producers will make direct connections to individual kafka hosts that
> > they learned about from connecting through the VIP?
> >
> > Jason
> >
> > On Tue, Nov 20, 2012 at 10:20 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> >
> > > I think the confusion is that we are answering a slightly different
> > > question than the one you are asking. If I understand correctly, you are
> > > asking, "Do I need to put ALL the kafka broker urls into the config for
> > > the client, and will this need to be updated if I add machines to the
> > > cluster?"
> > >
> > > The answer to both these questions is no. The broker list configuration
> > > will work exactly as your zookeeper configuration worked. Namely, you must
> > > have the URL of at least one operational broker in the cluster, and the
> > > producer will use this/these urls to fetch a complete topology of the
> > > cluster (all nodes, and what partitions they have). If you add kafka
> > > brokers or migrate partitions from one broker to another, clients will
> > > automatically discover this and adjust appropriately with no need for
> > > config changes. The broker list you give is only used when fetching
> > > metadata; all producer requests go directly to the appropriate broker. As
> > > a result you can use a VIP for the broker list if you like, without having
> > > any of the actual data you send go through that VIP.
> > >
> > > > As Neha and Jun mentioned, there were a couple of reasons for this change:
> > > > 1. If you use kafka heavily, everything ends up connecting to zk, and any
> > > > operational change to zk or upgrade becomes immensely difficult.
> > > > 2. Zk support outside java is spotty at best.
> > > > 3. In effect we were not using zk for what it is good at (discovery),
> > > > because discovery is asynchronous. That is, if you try to send to the wrong
> > > > broker we need to give you an error right away and have you update your
> > > > metadata, and this will likely happen before the zk watcher fires. Plus,
> > > > once you handle this case you don't need the watcher. As a result zk is
> > > > just being used as a key-value store.
> > >
> > > -Jay
> > >
> > >
> > >
> > > On Tue, Nov 20, 2012 at 9:44 AM, Jason Rosenberg <[EMAIL PROTECTED]>
> > wrote:
> > >
> > > > Ok,
> > > >
> > > > > So, I'm still wrapping my mind around this.  I liked being able to use zk
> > > > > for all clients, since it made it very easy to think about how to update
> > > > > the kafka cluster.  E.g. how to add new brokers, how to move them all to
> > > > > new hosts entirely, etc., without having to redeploy all the clients.  The
> > > > > new brokers will simply advertise their new location via zk, and all
> > > > > clients will pick it up.
> > > > >
> > > > > Requiring a configured broker.list for each client means that
> > > > > 1000's of deployed apps need to be updated any time the kafka cluster
> > > > > changes, no?  (Or am I not understanding?)
> > > > >
> > > > > You mention that auto-discovery of new brokers will still work; is that
> > > > > dependent on the existing configured broker.list set still being available
> > > > > also?
> > > >
> > > > I can see though how this will greatly reduce the load on zookeeper.
> > > >
> > > > Jason
> > > >
> > > >
> > > >
> > > > On Tue, Nov 20, 2012 at 9:03 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Jason,
> > > > >
> > > > > Auto discovery of new brokers and rolling restart of brokers are