Hadoop and Ganglia 3.1 - not seeing any hadoop metrics
Hi,

I'm having trouble setting up Hadoop 0.20.2 with Ganglia 3.1.  Ganglia is
running, and I am getting standard metrics, but I am not seeing any of the
Hadoop metrics.  BTW, I'm running this in EC2.

I applied the GangliaContext31 patch, HADOOP-4675.  I believe the patch is
working, as I DO NOT SEE any ClassNotFoundExceptions in the hadoop logs.

CONFIG

$ cat conf/hadoop-metrics.properties

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=ip-***.ec2.internal:8649
...
[same for mapred.*, jvm.*, rpc.*]
...
Channels from /etc/ganglia/gmond.conf (not using multicast):

udp_send_channel {
  host=ip-***.ec2.internal
  port = 8649
  ttl = 1
}

udp_recv_channel {
  port = 8649
}

tcp_accept_channel {
  port = 8649
}

gmond.conf only has the standard collection groups, i.e. as far as I know, I
don't need to add an extra collection group for the Hadoop metrics.  Right?

I found quite a few posts on the net describing a similar problem (not seeing
the Hadoop metrics), and I've tried all the hints that I could find in those
threads.

1. I tested the FileContext in hadoop-metrics, and it seems to work.  I get
lots of data dumped to that file.

2. I tested with telnet to the server and port specified in
hadoop-metrics.properties.  That returns a big block of XML.

$ telnet ip-****.ec2.internal 8649
Trying ****...
Connected to ip-***.ec2.internal.
Escape character is '^]'.
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<!DOCTYPE GANGLIA_XML [
...

3. I tried pushing a dummy metric with gmetric.  This fake metric does show
up in the Ganglia web frontend.

gmetric --name=hello --value=10 --type=int8
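
The XML block from check 2 can also be searched for Hadoop metric names
instead of being read by eye.  The following is a minimal sketch, not a
definitive tool.  It assumes that gmond reports each metric as a METRIC
element with a NAME attribute inside a HOST element, and that the Hadoop
metric names start with the context names from hadoop-metrics.properties
(dfs., mapred., jvm., rpc.).  The function names are made up for this
example.

```python
import socket
import xml.etree.ElementTree as ET

# Assumption: Hadoop metric names begin with the context name.
HADOOP_PREFIXES = ("dfs.", "mapred.", "jvm.", "rpc.")


def read_gmond_xml(host, port=8649, timeout=5.0):
    """Read the full XML dump that gmond writes on its TCP accept channel."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def metric_names_by_host(xml_bytes):
    """Map each HOST name to the sorted list of its METRIC names."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for host in root.iter("HOST"):
        names = [metric.get("NAME") for metric in host.iter("METRIC")]
        result[host.get("NAME")] = sorted(names)
    return result


def hadoop_metrics(names, prefixes=HADOOP_PREFIXES):
    """Keep only the names that look like Hadoop metrics."""
    return [name for name in names if name.startswith(prefixes)]
```

Calling metric_names_by_host(read_gmond_xml("ip-***.ec2.internal")) with the
real host name and passing each list through hadoop_metrics() shows whether
gmond has received any Hadoop metrics.  If the lists are empty, the metrics
are not reaching gmond; if they are populated, the problem is more likely in
gmetad or the web frontend.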

Any suggestions about what else I can look into?  I don't need a Collection
Group for the hadoop metrics in gmond.conf, right?  Any better ways to
isolate the problem to gmond, gmetad or hadoop?
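
One possible way to separate Hadoop from gmond is to check whether Hadoop
sends any UDP datagrams at all.  The sketch below is only an illustration
under stated assumptions: it listens on a spare UDP port (8650 in the usage
note is an arbitrary choice, since gmond already holds 8649) and records the
sender and size of each datagram.  It does not decode the Ganglia wire
format, and the function names are made up for this example.

```python
import socket


def open_listener(port, bind_addr="0.0.0.0"):
    """Bind a UDP socket on the given port and return it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def capture_datagrams(sock, max_packets=5, timeout=30.0):
    """Collect (sender_ip, size_in_bytes) for up to max_packets datagrams.

    Stops early if no datagram arrives within `timeout` seconds.
    """
    packets = []
    sock.settimeout(timeout)
    try:
        while len(packets) < max_packets:
            data, sender = sock.recvfrom(65535)
            packets.append((sender[0], len(data)))
    except socket.timeout:
        pass  # nothing more arrived in time; return what we have
    return packets
```

To use it, temporarily point one context (for example dfs.servers) at the
listening host and port 8650, restart the daemon, and run
capture_datagrams(open_listener(8650)).  If datagrams arrive, Hadoop is
emitting metrics and the problem is on the gmond side; if nothing arrives
within a few periods, the problem is in the Hadoop metrics configuration or
in the network path.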

Thanks for any help or suggestions,
Marc