Flume >> mail # user >> LoadBalancing Sink Processor question

LoadBalancing Sink Processor question
I am curious about the observed behavior of a set of agents configured with a Load Balancing sink processor.

I have 4 'tier1' agents receiving events directly from app servers that feed into 2 'tier2' agents that write to HDFS. They are connected up via Avro Sink/Sources and a Load Balancing Sink Processor.
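For reference, the tier1 side is wired up roughly like this (agent, sink, host, and port names here are illustrative, not my actual config):

```properties
# Hypothetical tier1 agent 'a1' fanning out to two tier2 Avro sources
# through a load-balancing sink group.
a1.sources = r1
a1.channels = c1
a1.sinks = k1 k2
a1.sinkgroups = g1

a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = tier2-a
a1.sinks.k1.port = 4545

a1.sinks.k2.type = avro
a1.sinks.k2.channel = c1
a1.sinks.k2.hostname = tier2-b
a1.sinks.k2.port = 4545

a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
# backoff = true temporarily blacklists a failing sink instead of
# retrying it on every pass through the round-robin rotation
a1.sinkgroups.g1.processor.backoff = true
```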

Both 'tier2' agents write to the same HDFS directory, and I have observed that they occasionally step on each other; at that point one of the tier2 agents 'loses' and gets hung up on a file lease exception. I'm not concerned with that at the moment, as I know it isn't best practice and this is more of a pilot architecture.
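For what it's worth, one way to avoid the lease collisions is to give each tier2 agent's HDFS sink a distinct file prefix so the two writers never open the same file. A minimal sketch (agent and sink names are made up for illustration):

```properties
# Hypothetical tier2 agent 'a2' writing to a shared directory.
# A distinct hdfs.filePrefix per agent keeps the two writers from
# opening the same file and fighting over the HDFS lease.
a2.sinks = hdfs1
a2.sinks.hdfs1.type = hdfs
a2.sinks.hdfs1.channel = c1
a2.sinks.hdfs1.hdfs.path = /flume/events/%Y-%m-%d
a2.sinks.hdfs1.hdfs.filePrefix = tier2a
```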

My concern is that once a tier2 agent gets stuck, it inevitably fills its channel over time and then stops accepting put requests from the Avro source. At this point my *expectation* is that the upstream tier1 agents will continue to round-robin across the tier2 nodes, with every other 'put' request failing. Assuming the remaining tier2 node can handle the throughput (which it can), I would not expect the tier1 agents to ever fill their channels.

In actuality, the tier1 agents slowly fill their channels and eventually start refusing put attempts from the application servers. It seems that once a given batch has been allocated to the bad sink, it never gets released to be processed by the other, working sink.

Is this the way it should work? Is this a defect, or as designed? I will probably switch to a failover processor, because I really only need one HDFS writer to keep up with my data, but I do think this isn't working as intended.
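In case it's useful to anyone else, the failover variant I'm considering looks roughly like this (sink names k1/k2 are illustrative; priorities pick the preferred writer):

```properties
# Hypothetical failover sink group: k1 is preferred, and k2 takes over
# only while k1 is failing. maxpenalty caps the backoff (in ms)
# applied to a failed sink before it is retried.
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
a1.sinkgroups.g1.processor.maxpenalty = 10000
```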


JR 2013-04-01, 02:12