Hadoop and Ganglia 3.1 - not seeing any hadoop metrics
Marc Limotte 2011-11-03, 16:21
Hi,
I'm having trouble setting up Hadoop 0.20.2 with Ganglia 3.1. Ganglia is running and I am getting the standard metrics, but I am not seeing any of the Hadoop metrics. BTW, I'm running this in EC2.

I applied the GangliaContext31 patch, HADOOP-4675. I believe the patch is working, as I DO NOT SEE any ClassNotFoundExceptions in the hadoop logs.

CONFIG

    $ cat conf/hadoop-metrics.properties
    dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    dfs.period=10
    dfs.servers=ip-***.ec2.internal:8649
    ... [same for mapred.*, jvm.*, rpc.*] ...

Channels from /etc/ganglia/gmond.conf (not using multicast):

    udp_send_channel {
      host=ip-***.ec2.internal
      port = 8649
      ttl = 1
    }
    udp_recv_channel {
      port = 8649
    }
    tcp_accept_channel {
      port = 8649
    }

gmond.conf only has the standard collection groups. I.e., as far as I know, I don't need to add an extra Collection Group for the Hadoop metrics. Right?

I found quite a few posts on the net describing a similar problem (not seeing the hadoop metrics), and I've tried all the hints that I could find in those threads.

1. I tested the FileContext in hadoop-metrics, and it seems to work. I get lots of data dumped to that file.

2. I tested with telnet to the server and port specified in hadoop-metrics.properties. That returns a big block of XML.

    $ telnet ip-****.ec2.internal 8649
    Trying ****...
    Connected to ip-***.ec2.internal.
    Escape character is '^]'.
    <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
    <!DOCTYPE GANGLIA_XML [
    ...

3. I tried pushing a dummy metric with gmetric. This fake metric does come up in the ganglia webfrontend.

    gmetric --name=hello --value=10 --type=int8

Any suggestions about what else I can look into? I don't need a Collection Group for the hadoop metrics in gmond.conf, right? Are there any better ways to isolate the problem to gmond, gmetad or hadoop?

Thanks for any help or suggestions,
Marc
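One possible way to narrow this down is to parse the XML that gmond returns on port 8649 (the same dump as in step 2) and list which Hadoop-looking metric names each host reports. The following is only a minimal sketch: the function names are made up for illustration, and it assumes that Hadoop metrics would show up in gmond with names beginning with the context names from hadoop-metrics.properties (dfs., mapred., jvm., rpc.), which may not hold for every setup.

```python
import socket
import xml.etree.ElementTree as ET

# Assumption: Hadoop metric names in gmond start with the context names
# configured in hadoop-metrics.properties.
HADOOP_PREFIXES = ("dfs.", "mapred.", "jvm.", "rpc.")


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the full XML dump gmond writes on its tcp_accept_channel."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def hadoop_metrics_by_host(xml_bytes, prefixes=HADOOP_PREFIXES):
    """Map each HOST name to the sorted metric names matching the prefixes."""
    root = ET.fromstring(xml_bytes)
    found = {}
    for host in root.iter("HOST"):
        names = sorted(
            metric.get("NAME")
            for metric in host.iter("METRIC")
            if metric.get("NAME", "").startswith(prefixes)
        )
        found[host.get("NAME")] = names
    return found
```

Calling hadoop_metrics_by_host(fetch_gmond_xml(host)) with the gmond host name would then show, per host, whether any such metrics arrived. If every list is empty while the gmetric dummy metric is present, the metrics are probably not reaching gmond at all, which would point at the Hadoop side rather than at gmetad or the web frontend.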