Flume, mail # user - AvroSink and LoadBalancingRpcClient


Re: AvroSink and LoadBalancingRpcClient
Connor Woodson 2013-01-10, 19:33
To create an RpcClient, you pass a set of user-defined properties to the
RpcClientFactory, which returns an RpcClient instance. That client can be
the default NettyAvroRpcClient (a basic one-to-one connection), a
LoadBalancingRpcClient, or a FailoverRpcClient; you should never deal with
those classes directly.

You can make the AvroCLIClient use the load-balancing or failover client by
passing "-P some.file" on the command line. That file contains the list of
properties handed to the RpcClientFactory, and one of those properties
selects the type of client you wish to use. Look at
RpcClientConfigurationConstants to see the available properties; different
clients use different ones.
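
As a rough sketch, a properties file for the load-balancing client might look
like the one below. The host aliases (h1, h2), hostnames, and port are made
up for illustration, and the property names are my reading of
RpcClientConfigurationConstants, so check them against your Flume version
before relying on them.

```properties
# Ask the RpcClientFactory for the load-balancing client
# instead of the default NettyAvroRpcClient.
client.type = default_loadbalance

# Space-separated aliases, followed by one host:port entry per alias.
# These hostnames and the port are placeholders, not real endpoints.
hosts = h1 h2
hosts.h1 = centralFlumeE.example.org:41414
hosts.h2 = centralFlumeF.example.org:41414

# How the client picks the next host.
host-selector = round_robin

# Optional: temporarily skip hosts that have failed.
backoff = true
```

For the failover client, the same file should work with client.type set to
default_failover instead, although that client reads a somewhat different
set of properties.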

You can also use the RpcClient/RpcClientFactory directly in your own code,
either to make your own version of the EmbeddedAgent that doesn't need a
channel or to create an advanced version of the AvroSink that is capable of
choosing between the different RpcClients. Another good use is modifying
the Log4jAppender to allow the appender to connect to multiple hosts
through either LoadBalancing or Failover (I think this is a JIRA issue as
well).

- Connor
On Thu, Jan 10, 2013 at 1:58 AM, Denny Ye <[EMAIL PROTECTED]> wrote:

> Thanks, it runs well.
> One question: where should we use LoadBalancingRpcClient? Only in
> AvroCLIClient?
>
>
> 2013/1/10 Hari Shreedharan <[EMAIL PROTECTED]>
>
>>  +1. Using sink groups with the load-balancing sink processor is the
>> solution. Backoff is optional (enable it only if you want failed sinks to
>> be skipped for a while).
>>
>>
>> Hari
>>
>> --
>> Hari Shreedharan
>>
>> On Thursday, January 10, 2013 at 12:10 AM, Connor Woodson wrote:
>>
>> Forgot about sink processors; yes, it will work.
>>
>> The trick with this method is that you use a separate sink for each
>> endpoint, whereas the RpcClient (when exposed) handles it all by itself.
>> Your configuration will need to look something like this:
>>
>> -----------------
>>
>> <sources>
>>
>> a1.channels = c1
>> <channel setup>
>>
>> a1.sinks = k1 k2
>>
>> a1.sinks.k1.type = AVRO
>> < set up centralFlumeE connection >
>> a1.sinks.k1.channel = c1
>>
>> a1.sinks.k2.type = AVRO
>> < set up centralFlumeF connection >
>> a1.sinks.k2.channel = c1
>>
>> a1.sinkgroups = g1
>> a1.sinkgroups.g1.sinks = k1 k2
>> a1.sinkgroups.g1.processor.type = load_balance
>> a1.sinkgroups.g1.processor.backoff = true
>> a1.sinkgroups.g1.processor.selector = round_robin
>>
>> -----------------
>>
>> Here is the relevant link for the load balancing processor:
>> http://flume.apache.org/FlumeUserGuide.html#load-balancing-sink-processor
>>
>> Remember that all sinks in a sink group must share the same channel. This
>> is load balancing, which is what you are seeking in your scenario; the load
>> balancer is not for failover (a setup with primary and backup servers),
>> although there is a FailoverSinkProcessor if that's needed.
>>
>> - Connor
>>
>>
>> On Wed, Jan 9, 2013 at 11:55 PM, Denny Ye <[EMAIL PROTECTED]> wrote:
>>
>> Hi Hari,
>>     I can't tell whether the method you suggested fits my situation, so I
>> would like to explain my case and ask for your comments. Thanks a lot!
>>     What I need is load balancing during event transfer. Assume I have a
>> single local Flume server (co-located with the application) named
>> 'localFlumeA', configured with a single AvroSink and channel, and two
>> central Flume servers (collectors) named 'centralFlumeE' and
>> 'centralFlumeF'. In this case, I would like to configure load balancing
>> between 'centralFlumeE' and 'centralFlumeF' for events coming from
>> 'localFlumeA', so that the load is distributed evenly across the two
>> central Flume servers.
>>     Do you think this can be configured with the LoadBalancingSinkProcessor?
>> I would appreciate your advice.
>>
>> -Regards
>> Denny Ye
>>
>>
>> 2013/1/10 Hari Shreedharan <[EMAIL PROTECTED]>
>>
>>  The LoadBalancing capability similar to the LoadBalancingRpcClient can
>> be configured for multiple Avro Sinks using a LoadBalancingSinkProcessor,