Hello,
My configuration:

- flume-ng 1.4
- hadoop 1.0.3

When we reboot the namenode, the Flume agent logs the errors below and does not reattach to HDFS when the namenode is back.
The only solution is to restart the Flume agent.

What can I do to make the agent reconnect automatically?

Agent config:

LOG.sinks.sinkHDFS.type = hdfs
LOG.sinks.sinkHDFS.hdfs.fileType = DataStream
LOG.sinks.sinkHDFS.hdfs.path = hdfs://server1:57001/user/PROD/WB/%Y-%m-%d/%H-%M
LOG.sinks.sinkHDFS.hdfs.filePrefix = weblo
LOG.sinks.sinkHDFS.hdfs.fileSuffix = .log
LOG.sinks.sinkHDFS.hdfs.rollInterval = 600
LOG.sinks.sinkHDFS.hdfs.rollSize = 100000000
LOG.sinks.sinkHDFS.hdfs.rollCount = 0
LOG.sinks.sinkHDFS.hdfs.idleTimeout = 60
LOG.sinks.sinkHDFS.hdfs.round = true
LOG.sinks.sinkHDFS.hdfs.roundUnit = minute
LOG.sinks.sinkHDFS.hdfs.roundValue = 10
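
Everything else on the sink is left at its defaults; in particular I have not touched any timeout/retry properties. Something like the sketch below is what I had in mind, but I am not sure which of these keys 1.4 honours (hdfs.callTimeout is documented for 1.4; hdfs.closeTries and hdfs.retryInterval may only exist in newer releases):

# Hypothetical additions -- please check the Flume user guide for your version
# Give HDFS calls more headroom while the namenode is restarting (milliseconds)
LOG.sinks.sinkHDFS.hdfs.callTimeout = 30000
# Keep retrying to close/rename .tmp files after a failure (0 = no limit; may be newer than 1.4)
LOG.sinks.sinkHDFS.hdfs.closeTries = 0
# Seconds between close retries (may be newer than 1.4)
LOG.sinks.sinkHDFS.hdfs.retryInterval = 180

Would raising hdfs.callTimeout be enough here, or does the BucketWriter need to be reopened some other way?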

Agent log:
14/04/01 09:00:00 INFO hdfs.BucketWriter: Renaming hdfs://server1:57001/user/PROD/WB/2014-04-01/08-50/weblo.1396335000014.log.tmp to hdfs://server1:57001/user/PROD/WB/2014-04-01/08-50/weblo.1396335000014.log
14/04/01 09:00:26 WARN hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_2203194302939327399_13019104
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readLong(DataInputStream.java:399)
        at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2967)

14/04/01 09:00:26 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 bad datanode[0] 210.10.44.22:50010
14/04/01 09:00:26 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 in pipeline 210.10.44.22:50010, 210.10.44.29:50010, 210.10.44.21:50010: bad datanode 210.10.44.22:50010
14/04/01 09:00:27 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 0 time(s).
14/04/01 09:00:28 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 1 time(s).
14/04/01 09:00:29 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 2 time(s).
14/04/01 09:00:30 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 3 time(s).
14/04/01 09:00:31 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 4 time(s).
14/04/01 09:00:32 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 5 time(s).
14/04/01 09:00:33 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 6 time(s).
14/04/01 09:00:34 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 7 time(s).
14/04/01 09:00:35 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 8 time(s).
14/04/01 09:00:36 INFO ipc.Client: Retrying connect to server: hadoopserver1/210.10.44.29:50020. Already tried 9 time(s).
14/04/01 09:00:36 WARN hdfs.DFSClient: Failed recovery attempt #0 from primary datanode 210.10.44.29:50010
java.net.ConnectException: Call to hadoopserver1/210.10.44.29:50020 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
        at org.apache.hadoop.ipc.Client.call(Client.java:1075)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
        at $Proxy7.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
        at org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:160)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3120)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:489)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1206)
        at org.apache.hadoop.ipc.Client.call(Client.java:1050)
        ... 7 more
14/04/01 09:00:36 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 failed  because recovery from primary datanode 210.10.44.29:50010 failed 1 times.  Pipeline was 210.10.44.22:50010, 210.10.44.29:50010, 210.10.44.21:50010. Will retry...
14/04/01 09:00:36 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 bad datanode[0] 210.10.44.22:50010
14/04/01 09:00:36 WARN hdfs.DFSClient: Error Recovery for block blk_2203194302939327399_13019104 in pipeline 210.10.44.22:50010, 210.10.44.29:50010, 210.10.44.21:50010: bad datanode 210.10.44.22:50010
14/04/01 09:07:40 INFO ipc.Client: Retrying connect to server: server1/210.10.44.26:57001. Already tried 8 time(s).
14/04/01 09:07:41 INFO ipc.Client: Retrying connect to server: server1/210.10.44.26:57001. Already tried 9 time(s).
14/04/01 09:07:41 WARN hdfs.DFSClient: Problem renewing lease for DFSClient_893652616
java.net.ConnectException: Call to server1/210.10.44.26:57001 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1099)
        at org.apach
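
One thing I notice: the ten "Retrying connect" attempts match the Hadoop IPC client default, so on the Flume host I could try raising that in core-site.xml. I am not sure it helps with the reattach problem itself, and the property name is taken from the Hadoop 1.x defaults as I understand them, so please correct me if this is the wrong knob:

<!-- core-site.xml on the Flume host; assumes the Hadoop 1.x DFS client honours this -->
<property>
  <name>ipc.client.connect.max.retries</name>
  <!-- default is 10; raise it so the client keeps trying while the namenode restarts -->
  <value>50</value>
</property>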