
Juan Gentile 2012-10-09, 18:03
RE: Flume-ng - Distributed
You would run a flume-ng agent on each node with an Avro sink. Then on your collector machine you would run another flume-ng agent with an Avro source, which receives the events the nodes send.
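
As a rough sketch of that layout (untested): the agent names `node` and `collector`, the host `collector1.example.com`, and port 4141 are placeholders made up for illustration, and MySource / MySink stand in for the custom classes from the original question.

```properties
# Node agent: runs on every machine that reads from the queue.
node.sources = source1
node.channels = channel1
node.sinks = avroSink

node.channels.channel1.type = memory
node.channels.channel1.capacity = 1000
node.channels.channel1.transactionCapacity = 1000

# Custom source from the original question (placeholder class name).
node.sources.source1.type = MySource
node.sources.source1.channels = channel1

# Avro sink forwards events to the collector (hypothetical host and port).
node.sinks.avroSink.type = avro
node.sinks.avroSink.channel = channel1
node.sinks.avroSink.hostname = collector1.example.com
node.sinks.avroSink.port = 4141

# Collector agent: runs on the collector machine only.
collector.sources = avroSource
collector.channels = channel1
collector.sinks = sink1

collector.channels.channel1.type = memory
collector.channels.channel1.capacity = 1000
collector.channels.channel1.transactionCapacity = 1000

# Avro source listens on the port the node sinks point at.
collector.sources.avroSource.type = avro
collector.sources.avroSource.bind = 0.0.0.0
collector.sources.avroSource.port = 4141
collector.sources.avroSource.channels = channel1

# Custom HDFS-writing sink from the original question (placeholder class name).
collector.sinks.sink1.type = MySink
collector.sinks.sink1.channel = channel1
```

With this split, only the collector needs to write to HDFS; the nodes just read from the queue and forward.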

If you run more than one collector, you can set up sink groups on the nodes and configure them for failover or load balancing.
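
A sink group for two collectors might look roughly like this. Again an untested sketch: the agent name `node`, the sink and group names, the hosts, and the port are all assumptions for illustration, and the source and channel definitions are omitted.

```properties
# Two Avro sinks on the node agent, one per collector (hypothetical hosts).
node.sinks = avroSink1 avroSink2

node.sinks.avroSink1.type = avro
node.sinks.avroSink1.channel = channel1
node.sinks.avroSink1.hostname = collector1.example.com
node.sinks.avroSink1.port = 4141

node.sinks.avroSink2.type = avro
node.sinks.avroSink2.channel = channel1
node.sinks.avroSink2.hostname = collector2.example.com
node.sinks.avroSink2.port = 4141

# Put both sinks in one group so a sink processor decides which one is used.
node.sinkgroups = group1
node.sinkgroups.group1.sinks = avroSink1 avroSink2

# Option A: failover. The sink with the higher priority is used
# until it fails, then the next one takes over.
node.sinkgroups.group1.processor.type = failover
node.sinkgroups.group1.processor.priority.avroSink1 = 10
node.sinkgroups.group1.processor.priority.avroSink2 = 5
node.sinkgroups.group1.processor.maxpenalty = 10000

# Option B: load balancing. Use these lines instead of Option A.
# node.sinkgroups.group1.processor.type = load_balance
# node.sinkgroups.group1.processor.selector = round_robin
```

Failover suits a primary/standby pair of collectors; load balancing makes more sense when both collectors should share the traffic.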

The concept of a flume master from flume 0.9.x does not exist in flume-ng. I personally keep the node and collector configs in the same config file under different agent names, and then keep that file synced on all machines.
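
To illustrate the single-file approach: each property is prefixed with an agent name, and the name passed at startup selects which set of properties that process uses. The agent names `node` and `collector` and the file path `conf/flume.conf` below are assumptions, not something from my actual setup.

```properties
# conf/flume.conf (hypothetical path), identical on every machine.

# Properties for the agent named "node".
node.sources = source1
node.channels = channel1
node.sinks = avroSink
# ... node.* component settings go here ...

# Properties for the agent named "collector".
collector.sources = avroSource
collector.channels = channel1
collector.sinks = sink1
# ... collector.* component settings go here ...

# Each process loads only the properties whose prefix matches its --name:
#
#   on every node machine:
#     bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name node
#
#   on the collector machine:
#     bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name collector
```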

These two docs are pretty helpful:


From: Juan Gentile [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 09, 2012 11:04 AM
Subject: Flume-ng - Distributed


I'm new to Flume-ng, and I'd like to ask how I can run an agent distributed across a cluster. I have developed my own source and sink: the source reads from a queue, and the sink stores the messages it receives in HDFS. If I want to run multiple instances of this, do I have to start it on each node?

This is my conf file:
agent1.channels.channel1.type = memory
agent1.channels.channel1.capacity = 1000
agent1.channels.channel1.transactionCapacity = 1000

agent1.sources.source1.channels = channel1
agent1.sources.source1.type = MySource

agent1.sinks.sink1.channel = channel1
agent1.sinks.sink1.type = MySink

agent1.channels = channel1
agent1.sources = source1
agent1.sinks = sink1
I see that there was a concept of a 'master' and a 'node' in the previous version of Flume. Is there something similar here?

Mike Percy 2012-10-10, 04:51
Juan Gentile 2012-10-10, 16:54
Camp, Roy 2012-10-10, 18:19
Hari Shreedharan 2012-10-10, 18:30
Harish Mandala 2012-10-10, 19:45
Juan Gentile 2012-10-10, 19:49
iain wright 2012-10-10, 19:55