Flume >> mail # user >> flume.EventDeliveryException: Failed to send events


flume.EventDeliveryException: Failed to send events
We are running Flume-NG 1.3.1 and periodically see the ERROR shown in the
stack trace further down (a few times daily).

We are using the File Channel connecting to 2 downstream collector agents
in 'round_robin' mode, using avro source/sinks.

We are using the config described below to deliver 5 different log types
(to 5 different ports downstream) and have seen the error occur at random
across all the ports.

We tried doubling the connect-timeout to 40000 (from the default of 20000)
with no success.
The agent appears to recover and keep on processing data.
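
The innermost cause in the trace below is "RPC request timed out" rather than a
connection failure, so one untested variation is to raise the request timeout
on the avro sinks as well. A minimal sketch, assuming the avro sink in this
Flume version supports a request-timeout property (milliseconds) alongside
connect-timeout; the value 40000 is only an illustration, not a recommendation:

```properties
# Assumption: request-timeout is honoured by the avro sink in this release.
# 40000 ms mirrors the connect-timeout value already tried; it is not tuned.
agent.sinks.eventdata-avro-sink-1.connect-timeout = 40000
agent.sinks.eventdata-avro-sink-1.request-timeout = 40000

agent.sinks.eventdata-avro-sink-2.connect-timeout = 40000
agent.sinks.eventdata-avro-sink-2.request-timeout = 40000
```

Whether this reduces the errors would depend on why the collectors are slow to
acknowledge batches, which the trace alone does not show.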

My questions are:
Has this data been lost, or will Flume eventually retry until a successful
delivery has been made?
Are there any other config changes I can make to prevent or reduce
this in the future?

05 Feb 2013 23:12:21,650 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160)  - Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to send events
        at org.apache.flume.sink.AvroSink.process(AvroSink.java:325)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: collector1, port: 4003 }: Failed to send batch
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:236)
        at org.apache.flume.sink.AvroSink.process(AvroSink.java:309)
        ... 3 more
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: collector2, port: 4003 }: RPC request timed out
        at org.apache.flume.api.NettyAvroRpcClient.waitForStatusOK(NettyAvroRpcClient.java:321)
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:295)
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:224)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException
        at org.apache.avro.ipc.CallFuture.get(CallFuture.java:132)
        at org.apache.flume.api.NettyAvroRpcClient.waitForStatusOK(NettyAvroRpcClient.java:310)
        ... 6 more

Below is a snapshot of the current config:

agent.sources.eventdata.command = tail -qn +0 -F /var/log/event-logs/live/eventdata.log
agent.sources.eventdata.channels = eventdata-avro-channel
agent.sources.eventdata.batchSize = 10
agent.sources.eventdata.restart = true

## event source interceptor
agent.sources.eventdata.interceptors.host-interceptor.type = org.apache.flume.interceptor.HostInterceptor$Builder
agent.sources.eventdata.interceptors.host-interceptor.hostHeader = source-host
agent.sources.eventdata.interceptors.host-interceptor.useIP = false

## eventdata channel
agent.channels.eventdata-avro-channel.type = file
agent.channels.eventdata-avro-channel.checkpointDir = /mnt/flume-ng/checkpoint/eventdata-avro-channel
agent.channels.eventdata-avro-channel.dataDirs = /mnt/flume-ng/data1/eventdata-avro-channel,/mnt/flume-ng/data2/eventdata-avro-channel
agent.channels.eventdata-avro-channel.maxFileSize = 210000000
agent.channels.eventdata-avro-channel.capacity = 2000000
agent.channels.eventdata-avro-channel.checkpointInterval = 300000
agent.channels.eventdata-avro-channel.transactionCapacity = 10000
agent.channels.eventdata-avro-channel.keep-alive = 20
agent.channels.eventdata-avro-channel.write-timeout = 20

## 2 x downstream click sinks for load balancing and failover
agent.sinks.eventdata-avro-sink-1.type = avro
agent.sinks.eventdata-avro-sink-1.channel = eventdata-avro-channel
agent.sinks.eventdata-avro-sink-1.hostname = collector1
agent.sinks.eventdata-avro-sink-1.port = 4003
agent.sinks.eventdata-avro-sink-1.batch-size = 100
agent.sinks.eventdata-avro-sink-1.connect-timeout = 40000

agent.sinks.eventdata-avro-sink-2.type = avro
agent.sinks.eventdata-avro-sink-2.channel = eventdata-avro-channel
agent.sinks.eventdata-avro-sink-2.hostname = collector2
agent.sinks.eventdata-avro-sink-2.port = 4003
agent.sinks.eventdata-avro-sink-2.batch-size = 100
agent.sinks.eventdata-avro-sink-2.connect-timeout = 40000

## load balance config
agent.sinkgroups = eventdata-avro-sink-group
agent.sinkgroups.eventdata-avro-sink-group.sinks = eventdata-avro-sink-1 eventdata-avro-sink-2
agent.sinkgroups.eventdata-avro-sink-group.processor.type = load_balance
agent.sinkgroups.eventdata-avro-sink-group.processor.selector = round_robin

Denis.