Flume, mail # user - LoadBalancing Sink Processor question

Re: LoadBalancing Sink Processor question
JR 2013-04-01, 02:12
Hi Paul,

   I apologize that I am not offering a solution; instead, I have a
question about your Avro sink to tier2 Avro source setup.

   Could you please share your conf file?  I have set up the sink and
source as follows, but I still get "RPC connection failed".

If you have had success, could you please tell me how you got yours to work?

What commands or shell scripts did you use to connect tier1 -->
tier2 --> HDFS?

Avro source ---> mem Channel ----> Avro sink --> (next node) avro source
--> mem channel ---> hdfs sink

#agent1 on  node1
 agent1.sources = avroSource
 agent1.channels = ch1
 agent1.sinks = avroSink

#agent2 on node2
 agent2.sources = avroSource2
 agent2.channels = ch2
 agent2.sinks = hdfsSink

# first source - avro
 agent1.sources.avroSource.type = avro
 agent1.sources.avroSource.bind =
 agent1.sources.avroSource.port = 41414
 agent1.sources.avroSource.channels = ch1

# first sink - avro
 agent1.sinks.avroSink.type = avro
 agent1.sinks.avroSink.hostname =
 agent1.sinks.avroSink.port = 41415
 agent1.sinks.avroSink.channel = ch1

# second source - avro
 agent2.sources.avroSource2.type = avro
 agent2.sources.avroSource2.bind = node2 ip
 agent2.sources.avroSource2.port = 41415
 agent2.sources.avroSource2.channel = ch2

# second sink - hdfs
 agent2.sinks.hdfsSink.type = hdfs
 agent2.sinks.hdfsSink.channel = ch2
agent2.sinks.hdfsSink.hdfs.writeFormat = Text
 agent2.sinks.hdfsSink.hdfs.filePrefix =  testing
 agent2.sinks.hdfsSink.hdfs.path = hdfs://node2:9000/flume/

# channels
 agent1.channels.ch1.type = memory
 agent1.channels.ch1.capacity = 1000
 agent2.channels.ch2.type = memory
 agent2.channels.ch2.capacity = 1000

I am getting errors with the ports. Could someone please check whether I have
connected the sink on node1 to the source on node2 properly?
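
One detail worth checking in a configuration like this: Flume sources take the plural property `channels`, while sinks take the singular `channel`. The `agent2` source above is given `channel`, which may keep that source from starting. The following is only a minimal sketch of a lint that flags this kind of slip. The function `lint_flume_properties` and its rules are hypothetical helpers written for illustration, not part of Flume, and the sample text is a fragment modeled on the configuration above.

```python
import re

# Hypothetical helper, not part of Flume: a tiny lint for Flume-style
# properties text. It checks only a few things that are easy to get wrong.
KEY_RE = re.compile(
    r"^(?P<agent>[^.\s]+)\.(?P<kind>sources|sinks|channels)"
    r"\.(?P<name>[^.\s]+)\.(?P<prop>\S+)$"
)
LIST_RE = re.compile(r"^[^.\s]+\.(sources|sinks|channels)$")


def lint_flume_properties(text):
    """Return a list of (line_number, message) warnings."""
    warnings = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep:
            warnings.append((lineno, "no '=' found"))
            continue
        if not value:
            warnings.append((lineno, f"empty value for '{key}'"))
        match = KEY_RE.match(key)
        if match is None:
            # Component lists such as 'agent1.sources = avroSource' are fine.
            if not LIST_RE.match(key):
                warnings.append(
                    (lineno, f"'{key}' lacks an agent/component prefix")
                )
            continue
        if match["kind"] == "sources" and match["prop"] == "channel":
            warnings.append((lineno, "sources take 'channels' (plural)"))
        if match["kind"] == "sinks" and match["prop"] == "channels":
            warnings.append((lineno, "sinks take 'channel' (singular)"))
    return warnings


sample = """\
agent2.sources = avroSource2
agent2.sources.avroSource2.type = avro
agent2.sources.avroSource2.port = 41415
agent2.sources.avroSource2.channel = ch2
agent2.sinks.hdfsSink.channel = ch2
"""

for lineno, message in lint_flume_properties(sample):
    print(f"line {lineno}: {message}")
```

The sketch deliberately knows nothing about which properties each component type requires; it only catches missing prefixes, empty values, and the `channel`/`channels` mix-up.
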

13/03/24 04:32:16 INFO source.AvroSource: Starting Avro source avroSource:
{ bindAddress:, port: 41414 }...
13/03/24 04:32:16 INFO instrumentation.MonitoredCounterGroup: Monitoried
counter group for type: SINK, name: avroSink, registered successfully.
13/03/24 04:32:16 INFO instrumentation.MonitoredCounterGroup: Component
type: SINK, name: avroSink started
13/03/24 04:32:16 INFO sink.AvroSink: Avro sink avroSink: Building
RpcClient with hostname:, port: 41415
13/03/24 04:32:16 WARN sink.AvroSink: Unable to create avro client using
hostname:, port: 41415
org.apache.flume.FlumeException: NettyAvroRpcClient { host:, port:
41415 }: RPC connection error
        at org.apache.flume.sink.AvroSink.start(AvroSink.java:242)
        at org.apache.flume.SinkRunner.start(SinkRunner.java:79)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:161)
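
An "RPC connection error" from the Avro sink may simply mean that nothing was accepting TCP connections on the target host and port when the sink started, for example because the tier2 agent was not yet running or its source failed to start. A minimal sketch of a reachability check, run from node1, could look like the following. The function name `can_connect` and the timeout value are assumptions made for illustration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For the setup described above, calling `can_connect("node2", 41415)` on node1 (with `node2` replaced by whatever address the tier2 source binds to) would show whether the second agent's Avro source is listening. A successful TCP connection only shows that the port is open, not that an Avro source is behind it.
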

On Fri, Mar 29, 2013 at 4:20 PM, Paul Chavez wrote:

> I am curious about the observed behavior of a set of agents configured
> with a Load Balancing sink processor.
> I have 4 'tier1' agents receiving events directly from app servers that
> feed into 2 'tier2' agents that write to HDFS. They are connected up via