Flume >> mail # user >> Failover Processor + Load Balanced Processor?


Re: Failover Processor + Load Balanced Processor?
Understood.  Thanks so much for your time!

On Fri, Aug 17, 2012 at 12:19 PM, Arvind Prabhakar <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 17, 2012 at 10:11 AM, Chris Neal <[EMAIL PROTECTED]> wrote:
>
>> Thanks Arvind,
>>
>> So in the load balanced scenario, if sink A goes down, and events go all
>> to sink B, does sink A's status ever get re-checked to be added back to the
>> pool?  Or once it's down, it's down?
>>
>
> It does get added back on subsequent invocations. The failover sink
> processor, on the other hand, has back-off semantics that
> exponentially increase the waiting period before a failed sink is retried.
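>
> As a sketch, the relevant failover knobs look like this (the group and
> sink names here are illustrative, not from your config):
>
> agent.sinkgroups.fo-group.sinks = sink-a sink-b
> agent.sinkgroups.fo-group.processor.type = failover
> agent.sinkgroups.fo-group.processor.priority.sink-a = 10
> agent.sinkgroups.fo-group.processor.priority.sink-b = 5
> # upper bound on the exponential back-off wait, in milliseconds
> agent.sinkgroups.fo-group.processor.maxpenalty = 10000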
>
> Regards,
> Arvind Prabhakar
>
>
>>
>> Chris
>>
>> On Fri, Aug 17, 2012 at 12:01 PM, Arvind Prabhakar <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> FYI - the load balancing sink processor does support simple failover
>>> semantics: if a sink is down, it proceeds to the next sink in the
>>> group until all sinks are exhausted. The failover sink processor, on
>>> the other hand, does more complex failure handling and back-off, such
>>> as blacklisting sinks that repeatedly fail. The issue [1] tracks
>>> enhancing the load balancing processor to support back-off semantics
>>> as well.
>>>
>>> The one issue with your configuration that I could spot at a quick
>>> glance is that you are adding your active sinks to both sink groups.
>>> This does not work: the configuration subsystem flags the second
>>> inclusion as a problem and ignores it. By design, a sink can either
>>> stand on its own or belong to exactly one sink group.
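>>>
>>> If what you want is round-robin load balancing that also skips a
>>> dead sink, a single group along these lines should suffice (a sketch
>>> based on the sink names in your config):
>>>
>>> agent.sinkgroups = lb-sink-group
>>> agent.sinkgroups.lb-sink-group.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
>>> agent.sinkgroups.lb-sink-group.processor.type = load_balance
>>> agent.sinkgroups.lb-sink-group.processor.selector = round_robin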
>>>
>>> [1] https://issues.apache.org/jira/browse/FLUME-1488
>>>
>>> Regards,
>>> Arvind Prabhakar
>>>
>>> On Fri, Aug 17, 2012 at 8:59 AM, Chris Neal <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi all.
>>>>
>>>> The User Guide talks about the various types of Sink Processors, but
>>>> doesn't say whether they can be combined.  A Failover Processor that
>>>> moves between 1..n sinks is great, as is a Load Balancing Processor
>>>> that moves between 1..n sinks, but best of all would be an agent that
>>>> can utilize both a Failover Processor AND a Load Balancing Processor!
>>>>
>>>> I've created a configuration which I believe supports this, and the
>>>> Agent starts up and processes events, but I wanted to ping this group to
>>>> make sure that this configuration is really doing what I think it is doing
>>>> behind the scenes.
>>>>
>>>> Comments?
>>>>
>>>> # Define the sources, sinks, and channels for the agent
>>>> agent.sources = avro-instance_1-source avro-instance_2-source
>>>> agent.channels = memory-agent-channel
>>>> agent.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
>>>> agent.sinkgroups = failover-sink-group lb-sink-group
>>>>
>>>> # Bind sources to channels
>>>> agent.sources.avro-instance_1-source.channels = memory-agent-channel
>>>> agent.sources.avro-instance_2-source.channels = memory-agent-channel
>>>>
>>>> # Define sink group for failover
>>>> agent.sinkgroups.failover-sink-group.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
>>>> agent.sinkgroups.failover-sink-group.processor.type = failover
>>>> agent.sinkgroups.failover-sink-group.processor.priority.avro-hdfs_1-sink = 5
>>>> agent.sinkgroups.failover-sink-group.processor.priority.avro-hdfs_2-sink = 10
>>>> agent.sinkgroups.failover-sink-group.processor.maxpenalty = 10000
>>>>
>>>> # Define sink group for load balancing
>>>> agent.sinkgroups.lb-sink-group.sinks = avro-hdfs_1-sink avro-hdfs_2-sink
>>>> agent.sinkgroups.lb-sink-group.processor.type = load_balance
>>>> agent.sinkgroups.lb-sink-group.processor.selector = round_robin
>>>>
>>>> # Bind sinks to channels
>>>> agent.sinks.avro-hdfs_1-sink.channel = memory-agent-channel
>>>> agent.sinks.avro-hdfs_2-sink.channel = memory-agent-channel
>>>>
>>>> # avro-instance_1-source properties
>>>> agent.sources.avro-instance_1-source.type = exec
>>>> agent.sources.avro-instance_1-source.command = tail -F /somedir/Trans.log
>>>> agent.sources.avro-instance_1-source.restart = true