Flume, mail # user - Failover Processor + Load Balanced Processor?

Failover Processor + Load Balanced Processor?
Chris Neal 2012-08-17, 15:59
Hi all.

The User Guide describes the various types of Sink Processors, but
doesn't say whether they can be combined.  A Failover Processor that
moves between 1..n sinks is great, as is a Load Balancing Processor that
distributes across 1..n sinks, but best of all would be an agent that
can use both a Failover Processor AND a Load Balancing Processor!

I've created a configuration which I believe supports this, and the Agent
starts up and processes events, but I wanted to ping this group to make
sure that this configuration is really doing what I think it is doing
behind the scenes.


# Define the sources, sinks, and channels for the agent
agent.sources = avro-instance_1-source avro-instance_2-source
agent.channels = memory-agent-channel
agent.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
agent.sinkgroups = failover-sink-group lb-sink-group

# Bind sources to channels
agent.sources.avro-instance_1-source.channels = memory-agent-channel
agent.sources.avro-instance_2-source.channels = memory-agent-channel

# Define sink group for failover
agent.sinkgroups.failover-sink-group.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
agent.sinkgroups.failover-sink-group.processor.type = failover
agent.sinkgroups.failover-sink-group.processor.priority.avro-hdfs_1-sink = 5
agent.sinkgroups.failover-sink-group.processor.priority.avro-hdfs_2-sink = 10
agent.sinkgroups.failover-sink-group.processor.maxpenalty = 10000

# Define sink group for load balancing
agent.sinkgroups.lb-sink-group.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
agent.sinkgroups.lb-sink-group.processor.type = load_balance
agent.sinkgroups.lb-sink-group.processor.selector = round_robin

# Bind sinks to channels
agent.sinks.avro-hdfs_1-sink.channel = memory-agent-channel
agent.sinks.avro-hdfs_2-sink.channel = memory-agent-channel

# avro-instance_1-source properties
agent.sources.avro-instance_1-source.type = exec
agent.sources.avro-instance_1-source.command = tail -F /somedir/Trans.log
agent.sources.avro-instance_1-source.restart = true
agent.sources.avro-instance_1-source.batchSize = 100

# avro-instance_2-source properties
agent.sources.avro-instance_2-source.type = exec
agent.sources.avro-instance_2-source.command = tail -F
agent.sources.avro-instance_2-source.restart = true
agent.sources.avro-instance_2-source.batchSize = 100

# avro-hdfs_1-sink properties
agent.sinks.avro-hdfs_1-sink.type = avro
agent.sinks.avro-hdfs_1-sink.hostname = hdfshost1.domain.com
agent.sinks.avro-hdfs_1-sink.port = 10000

# avro-hdfs_2-sink properties
agent.sinks.avro-hdfs_2-sink.type = avro
agent.sinks.avro-hdfs_2-sink.hostname = hdfshost2.domain.com
agent.sinks.avro-hdfs_2-sink.port = 10000

# memory-agent-channel properties
agent.channels.memory-agent-channel.type = memory
agent.channels.memory-agent-channel.capacity = 20000
agent.channels.memory-agent-channel.transactionCapacity = 100
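
A quick way to sanity-check a file like this before starting the agent is a short script that looks for keys defined more than once, sink groups that are used but not declared in agent.sinkgroups, and sinks listed in more than one group. The following is only a sketch: the function name check_sink_groups and the warning texts are made up for illustration, the parsing is a simplified key = value reader rather than Flume's own loader, and it says nothing about how Flume itself treats a sink shared by two groups. It only reports such cases so they can be reviewed.

```python
from collections import defaultdict


def check_sink_groups(text, agent="agent"):
    """Return a list of warnings about sink group definitions.

    Hypothetical helper: a simplified 'key = value' reader, not Flume's parser.
    """
    seen = {}
    warnings = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=" not in line:
            # Report the line instead of guessing where the key ends.
            warnings.append(f"no '=' on line: {line}")
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in seen:
            warnings.append(f"key defined twice: {key}")
        seen[key] = value  # the last definition wins, as in a properties file

    declared = seen.get(f"{agent}.sinkgroups", "").split()
    prefix = f"{agent}.sinkgroups."
    owners = defaultdict(list)  # sink name -> groups that list it
    for key, value in seen.items():
        if key.startswith(prefix) and key.endswith(".sinks"):
            group = key[len(prefix):-len(".sinks")]
            if group not in declared:
                warnings.append(
                    f"group not declared in {agent}.sinkgroups: {group}")
            for sink in value.split():
                owners[sink].append(group)

    for sink, groups in owners.items():
        if len(groups) > 1:
            warnings.append(
                f"sink {sink} is in several groups: {', '.join(groups)}")
    return warnings
```

Calling check_sink_groups(open("flume.conf").read()) (the file name is an assumption) and printing each returned warning gives a short list of lines to look at before the agent is started.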