Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Flume >> mail # user >> Failover Processor + Load Balanced Processor?

Copy link to this message
Re: Failover Processor + Load Balanced Processor?
I'm thinking of working on this(adding backoff semantics to the load
balancing processor)

The ticket FLUME-1488 however refers to the load balancing rpc client(or
is it just poorly worded/unclear?). If it is in fact a separate ticket
I'll file one for this

Anyway, I was  interested in hearing thoughts on approach. I'd have
liked to do it within the framework of the LoadBalancingSinkProcessor by
adding a new Selector, however as it is now, it the processor provides
no feedback to the selectors about whether sinks are working or not, so
this can't work.

This leaves two choices: write a new SinkProcessor or modify the
SinkSelector interface to give it a couple of callbacks that the
processor calls to inform the selector of trouble. This shouldn't really
be a problem even if people have written their own selectors so long as
they are extending AbstractSinkSelector which can stub the callbacks.


On 08/18/2012 02:01 AM, Arvind Prabhakar wrote:
> Hi,
> FYI - the load balancing sink processor does support simple failover
> semantics. The way it works is that if a sink is down, it will proceed
> to the next sink in the group until all sinks are exhausted. The
> failover sink processor on the other hand does complex failure
> handling and back-off such as blacklisting sinks that repeatedly fail
> etc. The issue [1] tracks enhancing this processor to support backoff
> semantics.
> The one issue with your configuration that I could spot by a quick
> glance is that you are adding your active sinks to both the sink
> groups. This does not really work and the configuration subsystem
> simply flags the second inclusion as a problem and ignores it. By
> design, a sink can either be on its own or in one explicit sink group.
> [1] https://issues.apache.org/jira/browse/FLUME-1488
> Regards,
> Arvind Prabhakar
> On Fri, Aug 17, 2012 at 8:59 AM, Chris Neal <[EMAIL PROTECTED]
> <mailto:[EMAIL PROTECTED]>> wrote:
>     Hi all.
>     The User Guide talks about the various types of Sink Processors,
>     but doesn't say whether they can be aggregated together.  A
>     Failover Processor that moves between 1..n sinks is great, as is a
>     Load Balancer Processor that moves between 1..n sinks, but what is
>     the best would be an agent that can utilize both a Failover
>     Processor AND a Load Balancer Processor!
>     I've created a configuration which I believe supports this, and
>     the Agent starts up and processes events, but I wanted to ping
>     this group to make sure that this configuration is really doing
>     what I think it is doing behind the scenes.
>     Comments?
>     # Define the sources, sinks, and channels for the agent
>     agent.sources = avro-instance_1-source avro-instance_2-source
>     agent.channels = memory-agent-channel
>     agent.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
>     agent.sinkgroups = failover-sink-group lb-sink-group
>     # Bind sources to channels
>     agent.sources.avro-instance_1-source.channels = memory-agent-channel
>     agent.sources.avro-instance_2-source.channels = memory-agent-channel
>     # Define sink group for failover
>     agent.sinkgroups.failover-sink-group.sinks = avro-hdfs_1-sink
>     avro-hdfs_2-sink
>     agent.sinkgroups.failover-sink-group.processor.type = failover
>     agent.sinkgroups.failover-sink-group.processor.priority.avro-hdfs_1-sink
>     = 5
>     agent.sinkgroups.failover-sink-group.processor.priority.avro-hdfs_2-sink
>     = 10
>     agent.sinkgroups.failover-sink-group.processor.maxpenalty = 10000
>     # Define sink group for load balancing
>     agent.sinkgroups = lb-sink-group
>     agent.sinkgroups.group1.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
>     agent.sinkgroups.group1.processor.type = load_balance
>     agent.sinkgroups.group1.processor.selector = round_robin
>     # Bind sinks to channels
>     agent.sinks.avro-hdfs_1-sink.channel = memory-agent-channel
>     agent.sinks.avro-hdfs_2-sink.channel = memory-agent-channel