Kafka >> mail # user >> Kafka Monitoring


Kafka Monitoring
Good evening. I have read through the monitoring section of the
documentation and tried to map each item to its corresponding JMX
attribute. I would appreciate it if you could answer a few questions below.

Thanks so much in advance,
Vadim

    What is the JMX bean
"kafka.controller":type="KafkaController",name="ActiveControllerCount" for?

    The rate of data in and out of the cluster and the number of messages
written
   Which JMX attributes should I monitor? Since I should alert on this, which
changes are acceptable and which are not?
    The log flush rate and the time taken to flush the log
    "kafka.log":type="LogFlushStats",name="LogFlushRateAndTimeMs"
Which attribute should I be watching, and how much deviation is acceptable
before I should alert?
    The number of partitions that have replicas that are down or have
fallen behind and are under-replicated.
   Is "kafka.cluster":type="Partition",name="buypets-0-UnderReplicated"
the JMX bean that will show replicas that are down?

    Unclean leader elections. This shouldn't happen.

"kafka.controller":type="ControllerStats",name="UncleanLeaderElectionsPerSec"
I assume this should always be 0, and if it is not 0 we have a problem.
    Number of partitions each node is the leader for.
   Which JMX attribute(s) monitor this?
    Leader elections: we track each time this happens and how long it took:

"kafka.controller":type="ControllerStats",name="LeaderElectionRateAndTimeMs"
    Any changes to the ISR
    Which JMX attribute should I monitor for this? Should I alert on this?
Which changes are reasonable and which are not?
    The number of produce requests waiting on replication to report back
   Which JMX attribute should I monitor for this? Should I alert on this?
Which changes are reasonable and which are not?
    The number of fetch requests waiting on data to arrive
   Which JMX attribute should I monitor for this? Should I alert on this?
Which changes are reasonable and which are not?
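
To make the questions above concrete, here is a minimal sketch of how one of
these beans could be polled over JMX and compared against a threshold. It is
not taken from the Kafka documentation. The class name KafkaJmxCheck, the
host:port argument, the attribute name "Count" and the threshold of 0 are all
assumptions for illustration; the attributes a bean actually exposes should
be checked with a JMX browser such as jconsole first.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class KafkaJmxCheck {
    // Bean name copied from the message above; the quotes are part of the name.
    static final String UNCLEAN_ELECTIONS =
        "\"kafka.controller\":type=\"ControllerStats\",name=\"UncleanLeaderElectionsPerSec\"";

    /** Reads one numeric attribute of an MBean and returns it as a double. */
    static double readNumber(MBeanServerConnection conn, String beanName, String attribute)
            throws Exception {
        Object value = conn.getAttribute(new ObjectName(beanName), attribute);
        return ((Number) value).doubleValue();
    }

    /** True when the observed value is strictly above the allowed threshold. */
    static boolean shouldAlert(double value, double threshold) {
        return value > threshold;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: KafkaJmxCheck <host:port>");
            return;
        }
        // Assumption: the broker JVM was started with remote JMX enabled on this port.
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Assumption: the bean exposes a cumulative "Count" attribute.
            double count = readNumber(conn, UNCLEAN_ELECTIONS, "Count");
            // Assumed policy: any unclean leader election is worth an alert.
            System.out.println(shouldAlert(count, 0) ? "ALERT: " + count : "OK");
        }
    }
}
```

The same readNumber call would work for any of the other beans listed above
once the right attribute name and an acceptable threshold are known, which is
exactly what the questions are asking.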

 