Kafka, mail # user - anecdotal uptime and service monitoring


S Ahmed 2012-12-28, 15:28
Jun Rao 2012-12-28, 22:48
Re: anecdotal uptime and service monitoring
S Ahmed 2013-01-30, 01:39
Jun,

Great list. I haven't really set up monitoring before, so for starters,
what should I be researching in order to monitor those metrics? Are they
exposed via the Yammer metrics library, so that they can be exported to a
CSV file, or are these JMX-related items?

On Fri, Dec 28, 2012 at 5:47 PM, Jun Rao <[EMAIL PROTECTED]> wrote:

> At LinkedIn, the most common failure of a Kafka broker is when we have to
> deploy new Kafka code/config. Otherwise, the broker can be up for a long
> time (e.g., months). It would be good to monitor the following metrics at
> the broker: log flush time/rate, produce/fetch request/message rates, GC
> rate/time, network bandwidth utilization, and disk space and I/O
> utilization. For the clients, it would be good to monitor message
> size/rate, request time/rate, dropped event rate (for async producers) and
> consumption lag (for consumers). For ZK, ideally, one should monitor ZK
> request latency and GCs.
>
> Thanks,
>
> Jun
>
> On Fri, Dec 28, 2012 at 7:27 AM, S Ahmed <[EMAIL PROTECTED]> wrote:
>
> > Curious what kind of uptime you guys have experienced using Kafka?
> >
> > What sort of monitoring do you suggest should be in place for Kafka?
> >
> > If the service crashes, does it usually make sense to have something like
> > upstart restart the service?
> >
> > There are a lot of moving parts (hard drive space, ZooKeeper, producers,
> > consumers, etc.)
> >
> > Also if the consumers can't keep up with new messages...
> >
>

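Two of the metrics mentioned in this thread, consumption lag and disk space utilization, are simple enough to sketch in a few lines. The snippet below is only an illustration and is not taken from Kafka or from anyone's setup in this thread: the function names `consumer_lag` and `disk_utilization` are hypothetical, the offsets are made-up sample values, and how the log end offsets and committed offsets would actually be collected (JMX, ZooKeeper, or something else) is left open. Lag comes out in whatever unit the offsets use.

```python
import shutil


def consumer_lag(log_end_offsets, committed_offsets):
    """Return per-partition lag: newest offset in the log minus the
    offset the consumer has committed."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with no committed offset is treated
        # as unconsumed from offset 0.
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero in case the two readings were taken at
        # slightly different times.
        lag[partition] = max(end_offset - committed, 0)
    return lag


def disk_utilization(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


if __name__ == "__main__":
    # Hypothetical offsets for a two-partition topic.
    ends = {0: 1500, 1: 900}
    committed = {0: 1200, 1: 900}
    print(consumer_lag(ends, committed))
    print(round(disk_utilization("."), 3))
```

A lag that keeps growing between samples would suggest the consumers are not keeping up with new messages, which is usually more telling than any single reading.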
 
Jun Rao 2013-01-30, 04:56