Kafka >> mail # user >> async producer behavior if zk and/or kafka cluster goes away...


Re: async producer behavior if zk and/or kafka cluster goes away...
Is there a configuration doc page for 0.8 (since apparently there are some
new settings)?

Jason

On Tue, Nov 20, 2012 at 10:39 AM, Jun Rao <[EMAIL PROTECTED]> wrote:

> That's right. VIP is only used for getting metadata. All producer send
> requests are through direct RPC to each broker.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 20, 2012 at 10:28 AM, Jason Rosenberg <[EMAIL PROTECTED]>
> wrote:
>
> > Ok,
> >
> > I think I understand (so I'll need to change some things in our setup to
> > work with 0.8).
> >
> > So the VIP is only for getting meta-data?  After that, under the covers,
> > the producers will make direct connections to individual kafka hosts that
> > they learned about from connecting through the VIP?
> >
> > Jason
> >
> > On Tue, Nov 20, 2012 at 10:20 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
> >
> > > > I think the confusion is that we are answering a slightly different
> > > > question than the one you are asking. If I understand correctly, you are
> > > > asking, "do I need to put ALL the kafka broker urls into the config for
> > > > the client, and will this need to be updated if I add machines to the
> > > > cluster?".
> > >
> > > > The answer to both these questions is no. The broker list configuration
> > > > will work exactly as your zookeeper configuration worked. Namely, you
> > > > must have the URL of at least one operational broker in the cluster, and
> > > > the producer will use this/these urls to fetch a complete topology of the
> > > > cluster (all nodes, and what partitions they have). If you add kafka
> > > > brokers or migrate partitions from one broker to another, clients will
> > > > automatically discover this and adjust appropriately with no need for
> > > > config changes. The broker list you give is only used when fetching
> > > > metadata; all producer requests go directly to the appropriate broker.
> > > > As a result you can use a VIP for the broker list if you like, without
> > > > having any of the actual data you send go through that VIP.
> > >
> > > > As Neha and Jun mentioned there were a couple of reasons for this change:
> > > > 1. If you use kafka heavily everything ends up connecting to zk and any
> > > > operational change to zk or upgrade becomes immensely difficult.
> > > > 2. Zk support outside java is spotty at best.
> > > > 3. In effect we were not using zk for what it is good at--discovery--because
> > > > discovery is asynchronous. That is, if you try to send to the wrong broker
> > > > we need to give you an error right away and have you update your metadata,
> > > > and this will likely happen before the zk watcher fires. Plus once you
> > > > handle this case you don't need the watcher. As a result zk is just being
> > > > used as a key-value store.
> > >
> > > -Jay
> > >
> > >
> > >
> > > > On Tue, Nov 20, 2012 at 9:44 AM, Jason Rosenberg <[EMAIL PROTECTED]> wrote:
> > >
> > > > Ok,
> > > >
> > > > > So, I'm still wrapping my mind around this.  I liked being able to use zk
> > > > > for all clients, since it made it very easy to think about how to update
> > > > > the kafka cluster.  E.g. how to add new brokers, how to move them all to
> > > > > new hosts entirely, etc., without having to redeploy all the clients.  The
> > > > > new brokers will simply advertise their new location via zk, and all
> > > > > clients will pick it up.
> > > >
> > > > > Requiring use of a configured broker.list for each client means that
> > > > > 1000's of deployed apps need to be updated any time the kafka cluster
> > > > > changes, no?  (Or am I not understanding?).
> > > >
> > > > > You mention that auto-discovery of new brokers will still work; is that
> > > > > dependent on the existing configured broker.list set still being available
> > > > > also?
> > > >
> > > > I can see though how this will greatly reduce the load on zookeeper.
> > > >
> > > > Jason
> > > >
> > > >
> > > >
> > > > On Tue, Nov 20, 2012 at 9:03 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > Jason,
> > > > >
> > > > > Auto discovery of new brokers and rolling restart of brokers are