Flume >> mail # user >> Error while writing to HDFS


Error while writing to HDFS
Hi

I started noticing the following error on our Flume nodes and was
wondering if anyone had any ideas. I am still trying to figure out if it's
related to something happening in our Hadoop cluster.

I am running about 20 of the following sink configurations:

${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.type = hdfs
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.path = hdfs://${HADOOP_NAMENODE}:8020/rawLogs/%Y-%m-%d/%H00
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.codeC = com.hadoop.compression.lzo.LzopCodec
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.fileType = CompressedStream
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.rollInterval = 300
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.rollSize = 0
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.rollCount = 0
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.batchSize = 2000
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.callTimeout = 60000
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.hdfs.filePrefix = ${FLUME_COLLECTOR_ID}_20
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.txnEventMax = 1000
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.serializer = gr_flume_utils.serializer.JSONEventSerializer$Builder
${FLUME_COLLECTOR_ID}.sinks.hdfs-sink20.channel = hdfs-fileChannel
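
For anyone reading the config above, here is a rough sketch of how the escape sequences in hdfs.path turn into a bucket directory. As far as I understand it, the HDFS sink fills in %Y, %m, %d and %H from each event's timestamp header (milliseconds since the epoch), not from the collector's clock, which would be consistent with a warning logged on 2012-11-06 naming a file under 2012-10-28/1700. The function name resolve_bucket_path is made up for illustration, UTC is an assumption, and Python's strftime only happens to agree with Flume for these four escapes.

```python
from datetime import datetime, timezone


def resolve_bucket_path(path_pattern: str, event_timestamp_ms: int) -> str:
    """Expand %Y, %m, %d and %H in an hdfs.path-style pattern.

    Hypothetical helper, not Flume code. Assumes the event timestamp is in
    milliseconds since the epoch and that buckets are computed in UTC.
    """
    event_time = datetime.fromtimestamp(event_timestamp_ms / 1000, tz=timezone.utc)
    # strftime leaves literal text (such as the trailing "00") untouched.
    return event_time.strftime(path_pattern)


# An event stamped 2012-10-28 17:30 UTC lands in the 1700 bucket for that day,
# no matter when the collector actually writes it.
event_ms = int(datetime(2012, 10, 28, 17, 30, tzinfo=timezone.utc).timestamp() * 1000)
print(resolve_bucket_path("/rawLogs/%Y-%m-%d/%H00", event_ms))
```

If that reading is right, delayed or replayed events would keep reopening old hourly directories, which may be worth ruling in or out when looking at the cluster side.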
2012-11-06 02:21:03,098 [hdfs-hdfs-sink11-call-runner-5] WARN org.apache.hadoop.hdfs.DFSClient - Error while syncing
java.io.EOFException
    at java.io.DataInputStream.readShort(DataInputStream.java:298)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3671)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3594)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2400(DFSClient.java:2792)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2987)
2012-11-06 02:21:03,098 [hdfs-hdfs-sink11-call-runner-5] WARN org.apache.flume.sink.hdfs.BucketWriter - Caught IOException while closing file (hdfs://van-mang-perf-hadoop-relay.net:8020/rawLogs/2012-10-28/1700/van-mang-perf-flume-collector2-relay-net_11.1352168380390.lzo.tmp). Exception follows.
java.io.EOFException
    at java.io.DataInputStream.readShort(DataInputStream.java:298)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3671)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3594)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2400(DFSClient.java:2792)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2987)
2012-11-06 02:21:03,105 [SinkRunner-PollingRunner-DefaultSinkProcessor] WARN org.apache.flume.sink.hdfs.HDFSEventSink - HDFS IO error
java.io.IOException: write beyond end of stream
    at com.hadoop.compression.lzo.LzopOutputStream.write(LzopOutputStream.java:127)
    at java.io.OutputStream.write(OutputStream.java:58)
    at org.apache.flume.sink.hdfs.HDFSCompressedDataStream.append(HDFSCompressedDataStream.java:81)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:328)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:711)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.call(HDFSEventSink.java:708)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
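
To check whether these warnings line up with something on the Hadoop side, one option is to bucket them by minute and logger and compare the result against the DataNode and NameNode logs for the same minutes. The following is only a sketch: the regular expression is inferred from the three entries above rather than from the actual log4j pattern, and count_by_minute_and_logger is a name invented here.

```python
import re
from collections import Counter

# Header of one log entry: timestamp, [thread], level, logger, message.
# \s+ also spans a line break, because long headers may be wrapped in mail.
ENTRY = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>[\w.$]+) - (?P<message>[^\n]*)"
)


def count_by_minute_and_logger(log_text, level="WARN"):
    """Count entries of the given level per (minute, short logger name)."""
    counts = Counter()
    for match in ENTRY.finditer(log_text):
        if match.group("level") == level:
            minute = match.group("ts")[:16]  # e.g. "2012-11-06 02:21"
            logger = match.group("logger").rsplit(".", 1)[-1]
            counts[(minute, logger)] += 1
    return counts


# Entry headers copied from the log excerpt above (stack frames omitted).
sample = """2012-11-06 02:21:03,098 [hdfs-hdfs-sink11-call-runner-5] WARN
 org.apache.hadoop.hdfs.DFSClient - Error while syncing
java.io.EOFException
2012-11-06 02:21:03,098 [hdfs-hdfs-sink11-call-runner-5] WARN
 org.apache.flume.sink.hdfs.BucketWriter - Caught IOException while closing
2012-11-06 02:21:03,105 [SinkRunner-PollingRunner-DefaultSinkProcessor]
WARN  org.apache.flume.sink.hdfs.HDFSEventSink - HDFS IO error
java.io.IOException: write beyond end of stream
"""

for (minute, logger), n in sorted(count_by_minute_and_logger(sample).items()):
    print(minute, logger, n)
```

Run against a full collector log, a burst confined to a few minutes would point toward a cluster-side event, while a steady trickle across all sinks would suggest looking at the sink configuration first.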
Thanks

Cameron Gandevia