Kafka, mail # dev - Metrics in new producer


Re: Metrics in new producer
Jay Kreps 2014-02-06, 21:51
Also, here is the javadoc for this package:
http://empathybox.com/kafka-metrics-javadoc/index.html?kafka/common/metrics/package-summary.html
On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> Hey guys,
>
> I wanted to kick off a quick discussion of metrics with respect to the new
> producer and consumer (and potentially the server).
>
> At a high level I think there are three approaches we could take:
> 1. Plain vanilla JMX
> 2. Use Coda Hale (AKA Yammer) Metrics
> 3. Do our own metrics (with JMX as one output)
>
> 1. Plain JMX has the advantage that it is the most commonly used Java
> mechanism for this and plugs in reasonably to most metrics systems. JMX is
> included in the JDK so
> it doesn't impose any additional dependencies on clients. It has the
> disadvantage that plain vanilla JMX is a pain to use. We would need a bunch
> of helper code for maintaining counters to make this reasonable.
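
To make the "helper code" point concrete, here is a minimal sketch of what option 1 looks like with nothing but the JDK. The class names, the object name, and the two counters are my own illustration, not anything from the Kafka code base. It uses the standard MBean convention, where a public interface named `<Impl>MBean` exposes getters that become JMX attributes.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.ObjectName;

// Illustrative only: every name in this class is hypothetical.
final class PlainJmxSketch {
    // Standard MBean management interface; each getter becomes an attribute
    // (getMessageCount -> "MessageCount", getBytesSent -> "BytesSent").
    public interface MessageStatsMBean {
        long getMessageCount();
        long getBytesSent();
    }

    // Hand-maintained counters: this is the boilerplate every metric would need.
    public static final class MessageStats implements MessageStatsMBean {
        private final AtomicLong messageCount = new AtomicLong();
        private final AtomicLong bytesSent = new AtomicLong();

        public void record(long messageSize) {
            messageCount.incrementAndGet();
            bytesSent.addAndGet(messageSize);
        }

        @Override public long getMessageCount() { return messageCount.get(); }
        @Override public long getBytesSent() { return bytesSent.get(); }
    }

    // Registers a fresh MessageStats under the given JMX object name.
    static MessageStats register(String objectName) {
        try {
            MessageStats stats = new MessageStats();
            ManagementFactory.getPlatformMBeanServer()
                    .registerMBean(stats, new ObjectName(objectName));
            return stats;
        } catch (JMException e) {
            throw new IllegalStateException("could not register " + objectName, e);
        }
    }

    // Reads one attribute back through the MBean server, as a JMX client would.
    static Object read(String objectName, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer()
                    .getAttribute(new ObjectName(objectName), attribute);
        } catch (JMException e) {
            throw new IllegalStateException("could not read " + attribute, e);
        }
    }

    public static void main(String[] args) {
        String name = "kafka.producer:type=message-stats-demo";
        MessageStats stats = register(name);
        stats.record(100);
        stats.record(300);
        System.out.println("MessageCount = " + read(name, "MessageCount"));
        System.out.println("BytesSent = " + read(name, "BytesSent"));
    }
}
```

Even this small case needs an interface, an implementation, and registration code per bean, and it still has no averages, rates, or maximums, which is the part that would have to be written by hand.
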
>
> 2. Coda Hale metrics is pretty good and broadly used. It supports JMX
> output as well as direct output to many other types of systems. The primary
> downside we have had with Coda Hale has to do with the clients and library
> incompatibilities. We are currently on an older more popular version. The
> newer version is a rewrite of the APIs and is incompatible. Originally
> these were totally incompatible and people had to choose one or the other.
> I think that has been improved so now the new version is a totally
> different package. But even in this case you end up with both versions if
> you use Kafka and we are on a different version than you, which is going to
> be pretty inconvenient.
>
> 3. Doing our own has the downside of potentially reinventing the wheel,
> and potentially needing to work out any bugs in our code. The upsides would
> depend on how good the reinvention was. As it happens I did a quick
> (~900 loc) version of a metrics library that is under kafka.common.metrics.
> I think it has some advantages over the Yammer metrics package for our
> usage beyond just not causing incompatibilities. I will describe this code
> so we can discuss the pros and cons. Although I favor this approach I have
> no emotional attachment and wouldn't be too sad if I ended up deleting it.
> Here are javadocs for this code, though I haven't written much
> documentation yet since I might end up deleting it:
>
> Here is a quick overview of this library.
>
> There are three main public interfaces:
>   Metrics - This is a repository of metrics being tracked.
>   Metric - A single, named numerical value being measured (e.g. a counter).
>   Sensor - This is a thing that records values and updates zero or more
> metrics.
>
> So let's say we want to track four values about message sizes;
> specifically say we want to record the average, the maximum, the total rate
> of bytes being sent, and a count of messages. Then we would do something
> like this:
>
>    // setup code
>    Metrics metrics = new Metrics(); // this is a global "singleton"
>    Sensor sensor = metrics.sensor("kafka.producer.message.sizes");
>    sensor.add("kafka.producer.message-size.avg", new Avg());
>    sensor.add("kafka.producer.message-size.max", new Max());
>    sensor.add("kafka.producer.bytes-sent-per-sec", new Rate());
>    sensor.add("kafka.producer.message-count", new Count());
>
>    // now when we get a message we do this
>    sensor.record(messageSize);
>
> The above code creates the global metrics repository, creates a single
> Sensor, and defines four named metrics that are updated by that Sensor.
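
For anyone who wants to play with the Sensor idea without the actual code, here is a self-contained sketch of the pattern as described: one `record()` call fans a value out to several named stats. This is not the kafka.common.metrics implementation. The `Stat` interface, the `value(name)` lookup, and the locking are my assumptions, and `Rate` is left out because it needs a time window that the description above does not specify.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the Sensor pattern; not the real Kafka classes.
final class SensorSketch {
    // A stat folds every recorded value into a single number.
    interface Stat {
        void record(double value);
        double value();
    }

    static final class Avg implements Stat {
        private double sum;
        private long n;
        public void record(double v) { sum += v; n++; }
        public double value() { return n == 0 ? 0.0 : sum / n; }
    }

    static final class Max implements Stat {
        private double max = Double.NEGATIVE_INFINITY;
        public void record(double v) { max = Math.max(max, v); }
        public double value() { return max; }
    }

    static final class Count implements Stat {
        private long n;
        public void record(double v) { n++; }
        public double value() { return n; }
    }

    // A sensor owns zero or more named stats and updates all of them per record().
    static final class Sensor {
        private final Map<String, Stat> metrics = new LinkedHashMap<>();

        synchronized void add(String metricName, Stat stat) {
            metrics.put(metricName, stat);
        }

        synchronized void record(double value) {
            for (Stat stat : metrics.values()) {
                stat.record(value);
            }
        }

        synchronized double value(String metricName) {
            return metrics.get(metricName).value();
        }
    }

    public static void main(String[] args) {
        Sensor sensor = new Sensor();
        sensor.add("kafka.producer.message-size.avg", new Avg());
        sensor.add("kafka.producer.message-size.max", new Max());
        sensor.add("kafka.producer.message-count", new Count());

        for (double size : new double[] {100, 300, 200}) {
            sensor.record(size);
        }
        System.out.println("avg = " + sensor.value("kafka.producer.message-size.avg"));
        System.out.println("max = " + sensor.value("kafka.producer.message-size.max"));
        System.out.println("count = " + sensor.value("kafka.producer.message-count"));
    }
}
```

The point of the design is that the hot path makes a single call and does not need to know which metrics hang off the sensor.
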
>
> Like Yammer Metrics (YM) I allow you to plug in "reporters", including a
> JMX reporter. Unlike the Coda Hale JMX reporter, the reporter I have keys
> off the metric names, not the Sensor names, which I think is an
> improvement--I just use the convention that the last portion of the name is
> the attribute name, the second to last is the mbean name, and the rest is
> the package. So in the above example there is a producer mbean that has an
> avg and max attribute and a producer mbean that has a bytes-sent-per-sec
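
The naming convention described above (last segment is the attribute, second to last is the mbean, the rest is the package) can be sketched as a small splitter. The class and field names here are hypothetical, and this only applies the rule literally as written. How the actual reporter turns the pieces into JMX object names is not shown in the message, so that step is left out.

```java
// Hypothetical helper that splits a dotted metric name per the stated convention.
final class MetricNameSketch {
    final String pkg;        // everything before the mbean segment
    final String mbean;      // second-to-last segment
    final String attribute;  // last segment

    private MetricNameSketch(String pkg, String mbean, String attribute) {
        this.pkg = pkg;
        this.mbean = mbean;
        this.attribute = attribute;
    }

    static MetricNameSketch parse(String metricName) {
        int lastDot = metricName.lastIndexOf('.');
        int prevDot = metricName.lastIndexOf('.', lastDot - 1);
        // Require a non-empty package, mbean, and attribute segment.
        boolean valid = prevDot >= 1
                && lastDot > prevDot + 1
                && lastDot < metricName.length() - 1;
        if (!valid) {
            throw new IllegalArgumentException(
                    "expected <package>.<mbean>.<attribute>, got: " + metricName);
        }
        return new MetricNameSketch(
                metricName.substring(0, prevDot),
                metricName.substring(prevDot + 1, lastDot),
                metricName.substring(lastDot + 1));
    }

    public static void main(String[] args) {
        MetricNameSketch n = parse("kafka.producer.message-size.avg");
        System.out.println("package   = " + n.pkg);
        System.out.println("mbean     = " + n.mbean);
        System.out.println("attribute = " + n.attribute);
    }
}
```

Read literally, the rule splits `kafka.producer.message-size.avg` into package `kafka.producer`, mbean `message-size`, and attribute `avg`.
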

 