Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Flume, mail # user - Failover Processor + Load Balanced Processor?


Copy link to this message
-
Re: Failover Processor + Load Balanced Processor?
Arvind Prabhakar 2012-08-17, 17:01
Hi,

FYI - the load balancing sink processor does support simple failover
semantics. The way it works is that if a sink is down, it will proceed to
the next sink in the group until all sinks are exhausted. The failover sink
processor on the other hand does complex failure handling and back-off such
as blacklisting sinks that repeatedly fail etc. The issue [1] tracks
enhancing this processor to support backoff semantics.

The one issue with your configuration that I could spot by a quick glance
is that you are adding your active sinks to both the sink groups. This does
not really work and the configuration subsystem simply flags the second
inclusion as a problem and ignores it. By design, a sink can either be on
its own or in one explicit sink group.

[1] https://issues.apache.org/jira/browse/FLUME-1488

Regards,
Arvind Prabhakar

On Fri, Aug 17, 2012 at 8:59 AM, Chris Neal <[EMAIL PROTECTED]> wrote:

> Hi all.
>
> The User Guide talks about the various types of Sink Processors, but
> doesn't say whether they can be aggregated together.  A Failover Processor
> that moves between 1..n sinks is great, as is a Load Balancer Processor
> that moves between 1..n sinks, but what is the best would be an agent that
> can utilize both a Failover Processor AND a Load Balancer Processor!
>
> I've created a configuration which I believe supports this, and the Agent
> starts up and processes events, but I wanted to ping this group to make
> sure that this configuration is really doing what I think it is doing
> behind the scenes.
>
> Comments?
>
> # Define the sources, sinks, and channels for the agent
> agent.sources = avro-instance_1-source avro-instance_2-source
> agent.channels = memory-agent-channel
> agent.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
> agent.sinkgroups = failover-sink-group lb-sink-group
>
> # Bind sources to channels
> agent.sources.avro-instance_1-source.channels = memory-agent-channel
> agent.sources.avro-instance_2-source.channels = memory-agent-channel
>
> # Define sink group for failover
> agent.sinkgroups.failover-sink-group.sinks = avro-hdfs_1-sink
> avro-hdfs_2-sink
> agent.sinkgroups.failover-sink-group.processor.type = failover
> agent.sinkgroups.failover-sink-group.processor.priority.avro-hdfs_1-sink > 5
> agent.sinkgroups.failover-sink-group.processor.priority.avro-hdfs_2-sink > 10
> agent.sinkgroups.failover-sink-group.processor.maxpenalty = 10000
>
> # Define sink group for load balancing
> agent.sinkgroups = lb-sink-group
> agent.sinkgroups.group1.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
> agent.sinkgroups.group1.processor.type = load_balance
> agent.sinkgroups.group1.processor.selector = round_robin
>
> # Bind sinks to channels
> agent.sinks.avro-hdfs_1-sink.channel = memory-agent-channel
> agent.sinks.avro-hdfs_2-sink.channel = memory-agent-channel
>
> # avro-instance_1-source properties
> agent.sources.avro-instance_1-source.type = exec
> agent.sources.avro-instance_1-source.command = tail -F /somedir/Trans.log
> agent.sources.avro-instance_1-source.restart = true
> agent.sources.avro-instance_1-source.batchSize = 100
>
> # avro-instance_2-source properties
> agent.sources.avro-instance_2-source.type = exec
> agent.sources.avro-instance_2-source.command = tail -F
> /somedir/UDXMLTrans.log
> agent.sources.avro-instance_2-source.restart = true
> agent.sources.avro-instance_2-source.batchSize = 100
>
> # avro-hdfs_1-sink properties
> agent.sinks.avro-hdfs_1-sink.type = avro
> agent.sinks.avro-hdfs_1-sink.hostname = hdfshost1.domin.com
> agent.sinks.avro-hdfs_1-sink.port = 10000
>
> # avro-hdfs_2-sink properties
> agent.sinks.avro-hdfs_2-sink.type = avro
> agent.sinks.avro-hdfs_2-sink.hostname = hdfshost2.domain.com
>  agent.sinks.avro-hdfs_2-sink.port = 10000
>
> # memory-agent-channel properties
> agent.channels.memory-agent-channel.type = memory
> agent.channels.memory-agent-channel.capacity = 20000
> agent.channels.memory-agent-channel.transactionCapacity = 100
>
> Thanks!
>