Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Kafka >> mail # dev >> Metrics in new producer


Copy link to this message
-
Re: Metrics in new producer
Also, here is the javadoc for this package:
http://empathybox.com/kafka-metrics-javadoc/index.html?kafka/common/metrics/package-summary.html
On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> Hey guys,
>
> I wanted to kick off a quick discussion of metrics with respect to the new
> producer and consumer (and potentially the server).
>
> At a high level I think there are three approaches we could take:
> 1. Plain vanilla JMX
> 2. Use Coda Hale (AKA Yammer) Metrics
> 3. Do our own metrics (with JMX as one output)
>
> 1. Has the advantage that JMX is the most commonly used java thing and
> plugs in reasonably to most metrics systems. JMX is included in the JDK so
> it doesn't impose any additional dependencies on clients. It has the
> disadvantage that plain vanilla JMX is a pain to use. We would need a bunch
> of helper code for maintaining counters to make this reasonable.
>
> 2. Coda Hale metrics is pretty good and broadly used. It supports JMX
> output as well as direct output to many other types of systems. The primary
> downside we have had with Coda Hale has to do with the clients and library
> incompatibilities. We are currently on an older more popular version. The
> newer version is a rewrite of the APIs and is incompatible. Originally
> these were totally incompatible and people had to choose one or the other.
> I think that has been improved so now the new version is a totally
> different package. But even in this case you end up with both versions if
> you use Kafka and we are on a different version than you which is going to
> be pretty inconvenient.
>
> 3. Doing our own has the downside of potentially reinventing the wheel,
> and potentially needing to work out any bugs in our code. The upsides would
> depend on the how good the reinvention was. As it happens I did a quick
> (~900 loc) version of a metrics library that is under kafka.common.metrics.
> I think it has some advantages over the Yammer metrics package for our
> usage beyond just not causing incompatibilities. I will describe this code
> so we can discuss the pros and cons. Although I favor this approach I have
> no emotional attachment and wouldn't be too sad if I ended up deleting it.
> Here are javadocs for this code, though I haven't written much
> documentation yet since I might end up deleting it:
>
> Here is a quick overview of this library.
>
> There are three main public interfaces:
>   Metrics - This is a repository of metrics being tracked.
>   Metric - A single, named numerical value being measured (i.e. a counter).
>   Sensor - This is a thing that records values and updates zero or more
> metrics
>
> So let's say we want to track three values about message sizes;
> specifically say we want to record the average, the maximum, the total rate
> of bytes being sent, and a count of messages. Then we would do something
> like this:
>
>    // setup code
>    Metrics metrics = new Metrics(); // this is a global "singleton"
>    Sensor sensor = metrics.sensor("kafka.producer.message.sizes");
>    sensor.add("kafka.producer.message-size.avg", new Avg());
>    sensor.add("kafka.producer.message-size.max", new Max());
>    sensor.add("kafka.producer.bytes-sent-per-sec", new Rate());
>    sensor.add("kafka.producer.message-count", new Count());
>
>    // now when we get a message we do this
>    sensor.record(messageSize);
>
> The above code creates the global metrics repository, creates a single
> Sensor, and defines 5 named metrics that are updated by that Sensor.
>
> Like Yammer Metrics (YM) I allow you to plug in "reporters", including a
> JMX reporter. Unlike the Coda Hale JMX reporter the reporter I have keys
> off the metric names not the Sensor names, which I think is an
> improvement--I just use the convention that the last portion of the name is
> the attribute name, the second to last is the mbean name, and the rest is
> the package. So in the above example there is a producer mbean that has a
> avg and max attribute and a producer mbean that has a bytes-sent-per-sec

 
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB