Denis Lowe 2013-02-06, 06:23
Juhani Connolly 2013-02-06, 08:21
Denis Lowe 2013-02-06, 18:10

Re: flume.EventDeliveryException: Failed to send events
I'm seeing the same thing :)

Mine is all on a local LAN though, so the fact that an RPC call doesn't
reply within 10000ms or 20000ms is quite odd.  My configuration is, for the
most part, the same as Denis' configuration: a two-tiered system, with
ExecSources running tail -F on log files feeding an AvroSink, which sends to
an AvroSource that loads into HDFS on the back tier.

I, too, see one of the following on the AvroSink.

Either (A):
[2013-04-15 23:57:14.827] [org.apache.flume.sink.LoadBalancingSinkProcessor] [ WARN] [SinkRunner-PollingRunner-LoadBalancingSinkProcessor] [] (LoadBalancingSinkProcessor.java:process:154) Sink failed to consume event. Attempting next sink if available.
org.apache.flume.EventDeliveryException: Failed to send events
        at org.apache.flume.sink.AvroSink.process(AvroSink.java:324)
        at org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:151)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: hadoopjt01.pegs.com, port: 10000 }: Failed to send batch
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:236)
        at org.apache.flume.sink.AvroSink.process(AvroSink.java:308)
        ... 3 more
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: hadoopjt01.pegs.com, port: 10000 }: Handshake timed out after 20000ms
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:280)
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:224)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:228)
        at java.util.concurrent.FutureTask.get(FutureTask.java:91)
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:278)
        ... 5 more

or (B):
[2013-04-15 19:49:01.135] [org.apache.flume.sink.LoadBalancingSinkProcessor] [ WARN] [SinkRunner-PollingRunner-LoadBalancingSinkProcessor] [] (LoadBalancingSinkProcessor.java:process:154) Sink failed to consume event. Attempting next sink if available.
org.apache.flume.EventDeliveryException: Failed to send events
        at org.apache.flume.sink.AvroSink.process(AvroSink.java:324)
        at org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:151)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: hadoopjt01.pegs.com, port: 10000 }: Failed to send batch
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:236)
        at org.apache.flume.sink.AvroSink.process(AvroSink.java:308)
        ... 3 more
Caused by: org.apache.flume.EventDeliveryException: NettyAvroRpcClient { host: hadoopjt01.pegs.com, port: 10000 }: RPC request timed out
        at org.apache.flume.api.NettyAvroRpcClient.waitForStatusOK(NettyAvroRpcClient.java:321)
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:295)
        at org.apache.flume.api.NettyAvroRpcClient.appendBatch(NettyAvroRpcClient.java:224)
        ... 4 more
Caused by: java.util.concurrent.TimeoutException
        at org.apache.avro.ipc.CallFuture.get(CallFuture.java:132)
        at org.apache.flume.api.NettyAvroRpcClient.waitForStatusOK(NettyAvroRpcClient.java:310)
        ... 6 more

The only thing I see on the AvroSource tier is the disconnect/reconnect
happening:

[2013-04-15 19:49:00.992] [org.apache.avro.ipc.NettyServer] [ INFO] [pool-11-thread-10] []  (NettyServer.java:handleUpstream:171) [id: 0x2a24ed78, /138.113.127.4:34481 :> /138.113.127.72:10000] DISCONNECTED
[2013-04-15 19:49:00.992] [org.apache.avro.ipc.NettyServer] [ INFO] [pool-11-thread-10] []  (NettyServer.java:handleUpstream:171) [id: 0x2a24ed78, /138.113.127.4:34481 :> /138.113.127.72:10000] UNBOUND
[2013-04-15 19:49:00.992] [org.apache.avro.ipc.NettyServer] [ INFO] [pool-11-thread-10] []  (NettyServer.java:handleUpstream:171) [id: 0x2a24ed78, /138.113.127.4:34481 :> /138.113.127.72:10000] CLOSED
[2013-04-15 19:49:00.993] [org.apache.avro.ipc.NettyServer] [ INFO] [pool-11-thread-10] []  (NettyServer.java:channelClosed:209) Connection to /138.113.127.4:34481 disconnected.
[2013-04-15 19:49:03.331] [org.apache.avro.ipc.NettyServer] [ INFO] [pool-10-thread-1] []  (NettyServer.java:handleUpstream:171) [id: 0x3883b82e, /138.113.127.4:62442 => /138.113.127.72:10000] OPEN
[2013-04-15 19:49:03.332] [org.apache.avro.ipc.NettyServer] [ INFO] [pool-11-thread-13] []  (NettyServer.java:handleUpstream:171) [id: 0x3883b82e, /138.113.127.4:62442 => /138.113.127.72:10000] BOUND: /138.113.127.72:10000
[2013-04-15 19:49:03.333] [org.apache.avro.ipc.NettyServer] [ INFO] [pool-11-thread-13] []  (NettyServer.java:handleUpstream:171) [id: 0x3883b82e, /138.113.127.4:62442 => /138.113.127.72:10000] CONNECTED: /138.113.127.4:62442

Is there some way to determine exactly where this bottleneck is?  The
list of config options for AvroSource/Sink is quite short, so there is not
much tuning to do.  Here is what I have:

# avro-hadoopjt01-sink properties
udprodae01.sinks.avro-hadoopjt01-sink.type = avro
udprodae01.sinks.avro-hadoopjt01-sink.hostname = hadoopjt01.pegs.com
udprodae01.sinks.avro-hadoopjt01-sink.port = 10000
udprodae01.sinks.avro-hadoopjt01-sink.batch-size = 100

# avro-hadoopjt01-source properties
hadoopjt01-1.sources.avro-hadoopjt01-source.type = avro
hadoopjt01-1.sources.avro-hadoopjt01-source.bind = hadoopjt01.pegs.com
hadoopjt01-1.sources.avro-hadoopjt01-source.port = 10000
hadoopjt01-1.sources.avro-hadoopjt01-source.threads = 64

I can try increasing the AvroSink timeout values, but they seem quite
adequate at the defaults.  Maybe more threads on the AvroSource?
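
For reference, a sketch of what raising the sink timeouts could look like.
It assumes the connect-timeout and request-timeout properties (both in
milliseconds) that the Flume User Guide lists for the Avro sink are
available in the Flume version in use.  The 30000 values are illustrative
only, not a recommendation:

# avro-hadoopjt01-sink timeout properties (assumed available; example values)
udprodae01.sinks.avro-hadoopjt01-sink.connect-timeout = 30000
udprodae01.sinks.avro-hadoopjt01-sink.request-timeout = 30000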

Any advice would be much appreciated.

Hari Shreedharan 2013-04-16, 18:52
Chris Neal 2013-04-16, 19:07
Hari Shreedharan 2013-04-16, 19:19
Brock Noland 2013-04-16, 19:26
Chris Neal 2013-04-16, 19:49
Brock Noland 2013-04-16, 19:57
Chris Neal 2013-04-16, 20:05