We do the same thing, but with Collectd as our graphing/collection mechanism. I am going to do a blog post in the next day or two with the code for our Flume data collection script and some example graphs. We've done a similar thing with ZooKeeper monitoring (http://engblog.nextdoor.com/post/49942956311/apache-zookeeper-performance-monitoring).
On May 15, 2013, at 9:36 AM, Paul Chavez <[EMAIL PROTECTED]> wrote:
> There are a few ways to monitor flume in operation. We use the JSON reporting, which is available via 'http://<agent address>:<port>/metrics'. You need to start the agent with the following parameters to get this interface:
> -Dflume.monitoring.type=http -Dflume.monitoring.port=34545
> We use Cacti to graph channel size, both as a percentage of the maximum and as the absolute number of events in the channel. This gives a warning if the sinks cannot keep up with the sources.
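As a rough illustration of the channel-size check described above, here is a minimal Python sketch that reads the JSON metrics and computes how full each channel is. It assumes the metrics document maps component names such as "CHANNEL.<name>" to dictionaries of string-valued counters, including "ChannelSize" and "ChannelCapacity"; check those key names against what your own agent returns. The function names and the host name in the usage note are made up for the example.

```python
import json
from urllib.request import urlopen


def fetch_metrics(host, port=34545, timeout=5):
    """Fetch and parse the JSON metrics document from a running agent."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def channel_fill(metrics):
    """Return {channel_name: (size, capacity, percent_full)}.

    Assumes channel entries are keyed "CHANNEL.<name>" and carry
    "ChannelSize" and "ChannelCapacity" counters as strings.
    """
    result = {}
    for name, counters in metrics.items():
        if not name.startswith("CHANNEL."):
            continue  # skip sources and sinks
        size = int(float(counters["ChannelSize"]))
        capacity = int(float(counters["ChannelCapacity"]))
        percent = 100.0 * size / capacity if capacity else 0.0
        result[name.split(".", 1)[1]] = (size, capacity, percent)
    return result
```

Something like channel_fill(fetch_metrics("flume-agent.example.com")) could then be polled by whatever graphing tool is in use; a percentage that keeps climbing suggests the sinks are falling behind.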
> We also graph ingress/egress event counts for some channels, much like a network bandwidth graph, to get an idea of the throughput and to see whether sources and sinks are running at the same speed.
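The ingress/egress graphs can be derived from two successive samples of the metrics document, in the same way a bandwidth graph is derived from interface counters. The sketch below assumes each "CHANNEL.<name>" entry exposes cumulative "EventPutSuccessCount" and "EventTakeSuccessCount" counters as strings; those key names and the function name are assumptions for illustration, so verify them against your agent's output.

```python
def event_rates(before, after, interval_seconds):
    """Return {channel_key: (puts_per_sec, takes_per_sec)} between two samples.

    `before` and `after` are parsed metrics documents taken
    `interval_seconds` apart. Channels whose counters went backwards
    (for example after an agent restart) are skipped.
    """
    rates = {}
    for name, now in after.items():
        prev = before.get(name)
        if prev is None or not name.startswith("CHANNEL."):
            continue
        puts = int(now["EventPutSuccessCount"]) - int(prev["EventPutSuccessCount"])
        takes = int(now["EventTakeSuccessCount"]) - int(prev["EventTakeSuccessCount"])
        if puts < 0 or takes < 0:
            continue  # counters were reset; no meaningful rate for this interval
        rates[name] = (puts / interval_seconds, takes / interval_seconds)
    return rates
```

If the put rate stays above the take rate for long, the channel is filling and the sinks are not keeping up with the sources.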
> From: liuyongbo [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, May 14, 2013 10:38 PM
> To: [EMAIL PROTECTED]
> Subject: how to print the channel capacity
> I’m using Flume to pass log data to MongoDB, but I find that some data is lost under heavy load, so I want to know the maximum number of requests that Flume can hold and need to print the channel capacity. However, I cannot find a proper way to do this other than changing the source code. Any ideas?