Kafka >> mail # user >> Re: Config for new clients (and server)


Re: Config for new clients (and server)
I'm not so sure about the static config names used in the producer, but I'm
+1 on using the key value approach for configs to ease operability.

Thanks,
Neha
On Wed, Feb 5, 2014 at 10:10 AM, Guozhang Wang <[EMAIL PROTECTED]> wrote:

> +1 for the key-value approach.
>
> Guozhang
>
>
> On Tue, Feb 4, 2014 at 9:34 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>
> > We touched on this a bit in previous discussions, but I wanted to draw out
> > the approach to config specifically as an item of discussion.
> >
> > The new producer and consumer use a key-value config approach similar to
> > the existing scala clients but have different implementation code to help
> > define these configs. The plan is to use the same approach on the server,
> > once the new clients are complete; so if we agree on this approach it will
> > be the new default across the board.
> >
> > Let me split this into two parts. First I will try to motivate the use of
> > key-value pairs as a configuration api. Then let me discuss the mechanics
> > of specifying and parsing these. If we agree on the public api then the
> > implementation details are interesting as this will be
> > shared across producer, consumer, and broker and potentially some tools;
> > but if we disagree about the api then there is no point in discussing the
> > implementation.
> >
> > Let me explain the rationale for this. In a sense a key-value map of
> > configs is the worst possible API to the programmer using the clients. Let
> > me contrast the pros and cons versus a POJO and motivate why I think it is
> > still superior overall.
> >
> > Pro: An application can externalize the configuration of its kafka
> > clients into its own configuration. Whatever config management system the
> > client application is using will likely support key-value pairs, so the
> > client should be able to directly pull whatever configurations are present
> > and use them in its client. This means that any configuration the client
> > supports can be added to any application at runtime. With the pojo
> > approach the client application has to expose each pojo getter as some
> > config parameter. The result of many applications doing this is that the
> > config is different for each and it is very hard to share a standard
> > client config across applications. Moving config into config files allows
> > the usual tooling (version control, review, audit, config deployments
> > separate from code pushes, etc.).
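
The externalization argument above can be sketched in a few lines; the config file name and key names here are illustrative assumptions, and the producer constructor is shown only in a comment since the actual client API is what the thread is still deciding:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class ProducerConfigExample {
    // Load client config from an externalized key-value source; the
    // application passes the pairs through verbatim, so a new client
    // config can be added at runtime with no application code change.
    static Properties parseConfig(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical key names, for illustration only.
        Properties props = parseConfig(
            "metadata.broker.list=localhost:9092\n" +
            "request.timeout.ms=30000\n");
        // A producer would then be built directly from the pairs, e.g.:
        // Producer producer = new Producer(props);
        System.out.println(props.getProperty("request.timeout.ms"));
    }
}
```

In practice the application would call `props.load(...)` on its own config file (or whatever its config management system exposes) rather than a literal string.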
> >
> > Pro: Backwards and forwards compatibility. Provided we stick to our java
> > api many internals can evolve and expose new configs. The application can
> > support both the new and old client by just specifying a config that will
> > be unused in the older version (and of course the reverse--we can remove
> > obsolete configs).
> >
> > Pro: We can use a similar mechanism for both the client and the server.
> > Since most people run the server as a stand-alone process it needs a
> > config file.
> >
> > Pro: Systems like Samza that need to ship configs across the network can
> > easily do so as configs have a natural serialized form. This can be done
> > with pojos using java serialization but it is ugly and has bizarre failure
> > cases.
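
The "natural serialized form" point is easy to illustrate with plain `java.util.Properties`, which already round-trips to text; this is a sketch of the idea, not how Samza itself ships configs:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;

public class ConfigSerializationExample {
    // Key-value configs serialize to plain text trivially...
    static String serialize(Properties props) throws IOException {
        StringWriter out = new StringWriter();
        props.store(out, null);
        return out.toString();
    }

    // ...and can be reconstructed unchanged on the other side of the wire,
    // with none of the class-version pitfalls of java serialization of pojos.
    static Properties deserialize(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty("batch.size", "16384"); // illustrative key name
        Properties copy = deserialize(serialize(props));
        System.out.println(copy.getProperty("batch.size"));
    }
}
```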
> >
> > Con: The IDE gives nice auto-completion for pojos.
> >
> > Con: There are some advantages to javadoc as a documentation mechanism
> > for java people.
> >
> > Basically to me this is about operability versus niceness of api and I
> > think operability is more important.
> >
> > Let me now give some details of the config support classes in
> > kafka.common.config and how they are intended to be used.
> >
> > The goal of this code is the following:
> > 1. Make specifying configs, their expected type (string, numbers, lists,
> > etc) simple and declarative
> > 2. Allow for validating simple checks (numeric range checks, etc)
> > 3. Make the config "self-documenting". I.e. we should be able to write
> > code that generates the configuration documentation off the config def.
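
To make those three goals concrete, here is a minimal sketch of what such a declarative config definition could look like; the class and method names are illustrative assumptions, not the actual kafka.common.config API:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Illustrative declarative config definition: each key is registered once
// with a default, a range check, and doc text, so values can be validated
// and documentation generated from the same definition.
public class ConfigDefSketch {
    static class Key {
        final String name, doc;
        final int def;
        final long min, max;
        Key(String name, int def, long min, long max, String doc) {
            this.name = name; this.def = def;
            this.min = min; this.max = max; this.doc = doc;
        }
    }

    private final Map<String, Key> keys = new LinkedHashMap<>();

    // Goal 1: specifying a config is a single declarative call.
    ConfigDefSketch defineInt(String name, int def, long min, long max, String doc) {
        keys.put(name, new Key(name, def, min, max, doc));
        return this;
    }

    // Goal 2: parsing applies defaults, coerces types, and enforces ranges.
    Map<String, Object> parse(Properties props) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Key k : keys.values()) {
            String raw = props.getProperty(k.name);
            long value = (raw == null) ? k.def : Long.parseLong(raw.trim());
            if (value < k.min || value > k.max)
                throw new IllegalArgumentException(k.name + " out of range: " + value);
            out.put(k.name, value);
        }
        return out;
    }
}
```

Goal 3 falls out for free: generating the documentation is just iterating over the registered keys and emitting each name, default, valid range, and doc string.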
