Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Flume >> mail # user >> Netcat source stops processing data

Copy link to this message
Re: Netcat source stops processing data

Are you reading the responses sent by the net cat source? If you don't read the OK sent by net cat source on your application side, your application's buffer gets full causing the net cat source to queue up stuff and eventually die. It is something we need to fix, but I don't know if anyone is using net cat in production - you should probably test using Avro source or the new HTTP source(for this you would need to build trunk/1.3 branch or wait for 1.3 release).

Hari Shreedharan
On Thursday, November 8, 2012 at 3:05 PM, Rahul Ravindran wrote:

> Hello,
>   I wanted to perform a load test to get an idea of how we would look to scale flume for our deployment. I have pasted the config file at the source below. I have a netcat source which is listening on a port and have 2 channels, 2 avro sinks consuming the events from the netcat source.
> My load generator is a simple C program which is continually sending 20 characters in each message using a socket, and send(). I notice that , initially, a lot of traffic makes it through and then the flume agent appears to stop consuming data(after about 80k messages). This results in the tcp receive and send buffer being full. I understand that the rate at which I am generating traffic may overwhelm flume, but I would expect it to gradually consume data. It does not consume any more messages. I looked through the flume logs and did not see anything there (no stack trace). I ran tcpdump and see that the receive window initially is non-zero but begins to decrease and then goes down to zero, and very slowly opens up to a size of 1 (once in 10 seconds)
> Could you help on what may be going on or if there is something wrong with my config?
> agent1.channels.ch1.type = MEMORY
> agent1.channels.ch1.capacity = 50000
> agent1.channels.ch1.transactionCapacity = 5000
> agent1.sources.netcat.channels = ch1
> agent1.sources.netcat.type= netcat
> agent1.sources.netcat.bind =
> agent1.sources.netcat.port = 44444
> agent1.sinks.avroSink1.type = avro
> agent1.sinks.avroSink1.channel = ch1
> agent1.sinks.avroSink1.hostname = <remote hostname>
> agent1.sinks.avroSink1.port = 4545
> agent1.sinks.avroSink1.connect-timeout = 300000
> agent1.sinks.avroSink2.type = avro
> agent1.sinks.avroSink2.channel = ch1
> agent1.sinks.avroSink2.hostname = <remote hostname>
> agent1.sinks.avroSink2.port = 4546
> agent1.sinks.avroSink2.connect-timeout = 300000
> agent1.channels = ch1
> agent1.sources = netcat
> agent1.sinks = avroSink1 avroSink2 avroSink2