Flume, mail # user - AvroSink and LoadBalancingRpcClient


Re: AvroSink and LoadBalancingRpcClient
Connor Woodson 2013-01-10, 07:05
Short answer: the current AvroSink offers no way to configure its
RpcClient, so you are limited to a single host connection (I'm not sure how
well it recovers if that host goes down).

The AvroSink is greatly simplified compared with what the RpcClient can do,
and it exposes none of the underlying functionality. Right now, the only way
around that is to write a custom sink based on the AvroSink source code
that, instead of setting up the RpcClient the way AvroSink currently does,
passes a set of user-supplied properties into RpcClientFactory.getInstance().
Implementing this in an unsafe way (without validating any of the user's
values) would take only a couple of lines of code, I believe. It is a
workaround, but it enables all of the RpcClient capabilities, such as
failover or load-balancing mode, and lets the sink connect to multiple hosts.
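
As a rough sketch of what those user-supplied properties might look like,
the snippet below only builds a java.util.Properties object for the
load-balancing client. The key names ("client.type", "hosts", "hosts.<alias>",
"host-selector", "backoff") are the ones I recall from the Flume SDK
documentation, so check them against your Flume version. The class name,
the h1/h2 aliases, and the collector host names are made up for
illustration, and the hand-off to the SDK is left as a comment so the
snippet compiles without Flume on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class LoadBalancingProps {

    /**
     * Builds client properties for a load-balancing RpcClient.
     * hostPorts holds entries of the form "host:port"; aliases h1, h2, ...
     * are generated here and are purely illustrative.
     */
    public static Properties build(List<String> hostPorts) {
        Properties props = new Properties();
        // Assumed key/value for selecting the load-balancing client type.
        props.setProperty("client.type", "default_loadbalance");

        StringBuilder aliases = new StringBuilder();
        for (int i = 0; i < hostPorts.size(); i++) {
            String alias = "h" + (i + 1);
            if (i > 0) {
                aliases.append(' ');
            }
            aliases.append(alias);
            // One "hosts.<alias>" entry per downstream Avro source.
            props.setProperty("hosts." + alias, hostPorts.get(i));
        }
        // Space-separated list of the aliases defined above.
        props.setProperty("hosts", aliases.toString());

        // Assumed selector name; "random" is the other built-in option.
        props.setProperty("host-selector", "round_robin");
        // Temporarily skip hosts that have failed.
        props.setProperty("backoff", "true");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical collector hosts.
        Properties props = build(Arrays.asList(
                "collector1.example.org:41414",
                "collector2.example.org:41414"));
        props.list(System.out);
        // In a custom sink, with flume-ng-sdk on the classpath, the
        // properties would then be handed to the factory, roughly:
        // RpcClient client = RpcClientFactory.getInstance(props);
    }
}
```

A custom sink would presumably fill these values from its own Context in
configure() rather than hard-coding them, and should validate them before
creating the client.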

I think there is a JIRA filed for this; if not, it would be very helpful
to have this implemented in the actual AvroSink. A related change would be
for RpcClientFactory.getInstance() to accept a Context object, simply for
ease of use.

- Connor
On Wed, Jan 9, 2013 at 10:55 PM, Denny Ye <[EMAIL PROTECTED]> wrote:

> Hi all,
>     I couldn't find how AvroSink relates to the other types of
> RpcClient, including LoadBalancingRpcClient. I would expect a user to be
> able to choose the RpcClient type from AvroSink, along with the various
> strategies and host selectors. I also couldn't find this information in
> the source code or the user guide. Did I miss something?
>      I hope someone can help, thanks!
>
> -Regards
> Denny Ye
>