Flume >> mail # user >> Netcat source stops processing data


Netcat source stops processing data
Hello,
I wanted to perform a load test to get an idea of how we would scale Flume for our deployment. I have pasted the agent config below. I have a netcat source listening on a port, with 2 channels and 2 Avro sinks consuming the events from the netcat source.

My load generator is a simple C program that continually sends 20-character messages over a socket using send(). Initially a lot of traffic makes it through, but after about 80k messages the Flume agent appears to stop consuming data, and the TCP receive and send buffers fill up. I understand that the rate at which I am generating traffic may overwhelm Flume, but I would expect it to keep consuming data gradually. Instead, it does not consume any more messages. I looked through the Flume logs and did not see anything there (no stack trace). I ran tcpdump and see that the receive window is initially non-zero, then decreases to zero, and only very slowly reopens to a size of 1 (about once every 10 seconds).
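
For reference, the load generator behaves roughly like the sketch below. It is written in Python rather than the original C for brevity. The names `make_message` and `generate_load`, the newline terminator, and the send timeout are assumptions made for illustration, not details of the actual program.

```python
import socket


def make_message(seq: int, length: int = 20) -> bytes:
    """Build one payload of exactly `length` characters plus a newline.

    The trailing newline is an assumption: the netcat source turns each
    line of text it receives into one event.
    """
    body = f"msg-{seq}".ljust(length, "x")[:length]
    return body.encode("ascii") + b"\n"


def generate_load(host: str, port: int, count: int, timeout: float = 10.0) -> int:
    """Send `count` messages over a single TCP connection.

    Like the generator described above, this only writes to the socket
    and never reads from it. The timeout (an assumption, not part of the
    original program) makes a stalled receiver visible: a send that
    blocks for longer than `timeout` seconds raises a timeout error
    instead of hanging forever.
    """
    sent = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for seq in range(count):
            sock.sendall(make_message(seq))
            sent += 1
    return sent
```

Calling `generate_load("127.0.0.1", 44444, 100000)` would target the bind address and port from the config below. Once the receive window drops to zero, the call would be expected to fail with a timeout rather than return.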

Could you help me understand what may be going on, or whether there is something wrong with my config?

agent1.channels.ch1.type = MEMORY
agent1.channels.ch1.capacity = 50000
agent1.channels.ch1.transactionCapacity = 5000

agent1.sources.netcat.channels = ch1
agent1.sources.netcat.type = netcat
agent1.sources.netcat.bind = 127.0.0.1
agent1.sources.netcat.port = 44444

agent1.sinks.avroSink1.type = avro
agent1.sinks.avroSink1.channel = ch1
agent1.sinks.avroSink1.hostname = <remote hostname>
agent1.sinks.avroSink1.port = 4545
agent1.sinks.avroSink1.connect-timeout = 300000

agent1.sinks.avroSink2.type = avro
agent1.sinks.avroSink2.channel = ch1
agent1.sinks.avroSink2.hostname = <remote hostname>
agent1.sinks.avroSink2.port = 4546
agent1.sinks.avroSink2.connect-timeout = 300000

agent1.channels = ch1
agent1.sources = netcat
agent1.sinks = avroSink1 avroSink2