Flume >> mail # user >> Block Under-replication detected. Rotating file.


Block Under-replication detected. Rotating file.
Hi,

I have a Flume agent with a spooling directory source and an HDFS sink. I have
configured the sink to roll files only when they reach a certain (quite large)
size (see the full config below). However, when I *restart* Flume, it first
generates ~15 small files (~500 bytes each) and only then starts writing
a large file. In the Flume logs from the time the small files are generated,
I see the message "Block Under-replication detected. Rotating file".

From the source code I've figured out several things:

1. This message is specific to Flume 1.3 and doesn't exist in the latest
version.
2. It comes from the BucketWriter.shouldRotate() method, which in turn calls
HDFSWriter.isUnderReplicated(); if that returns true, the above message is
logged and the file is rotated.
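
To make the control flow in point 2 concrete, here is a minimal, self-contained Java sketch of how I understand the check. It is not Flume's actual source: the ReplicationAwareWriter interface, the RotationSketch class and its fields are hypothetical stand-ins, and the size-based branch is only an assumption about how rollSize is evaluated.

```java
// Hypothetical sketch only: these types are illustrative stand-ins,
// not classes from the Flume code base.

/** Assumed stand-in for a writer that can report block under-replication. */
interface ReplicationAwareWriter {
    boolean isUnderReplicated();
}

class RotationSketch {
    private final ReplicationAwareWriter writer;
    private final long rollSize;      // bytes; 0 disables size-based rolling (assumption)
    private long bytesWritten = 0;

    RotationSketch(ReplicationAwareWriter writer, long rollSize) {
        this.writer = writer;
        this.rollSize = rollSize;
    }

    /** Records that some bytes were appended to the current file. */
    void append(long bytes) {
        bytesWritten += bytes;
    }

    /** Returns true if the current file should be closed and a new one opened. */
    boolean shouldRotate() {
        // The replication check comes first, so it can force a roll
        // even when the file is still far below rollSize.
        if (writer.isUnderReplicated()) {
            System.out.println("Block Under-replication detected. Rotating file.");
            return true;
        }
        return rollSize > 0 && bytesWritten >= rollSize;
    }

    public static void main(String[] args) {
        // A writer that always reports under-replication rolls after ~500 bytes.
        RotationSketch degraded = new RotationSketch(() -> true, 134217728L);
        degraded.append(500);
        System.out.println("degraded rotates: " + degraded.shouldRotate());

        // A healthy writer keeps the same file open until rollSize is reached.
        RotationSketch healthy = new RotationSketch(() -> false, 134217728L);
        healthy.append(500);
        System.out.println("healthy rotates: " + healthy.shouldRotate());
    }
}
```

If the real code behaves like this sketch, any period during which the writer reports under-replication would produce a new small file on each check, regardless of the configured rollSize.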

My questions are: why does this happen, and how do I fix it?

I'm running Flume 1.3 on CDH 4.3.

flume.config
-----------------

agent.sources = my-src
agent.channels = my-ch
agent.sinks = my-sink

agent.sources.my-src.type = spooldir
agent.sources.my-src.spoolDir = /flume/data
agent.sources.my-src.channels = my-ch
agent.sources.my-src.deletePolicy = immediate
agent.sources.my-src.interceptors = tstamp-int
agent.sources.my-src.interceptors.tstamp-int.type = timestamp

agent.channels.my-ch.type = file
agent.channels.my-ch.checkpointDir = /flume/checkpoint
agent.channels.my-ch.dataDirs = /flume/channel-data

agent.sinks.my-sink.type = hdfs
agent.sinks.my-sink.hdfs.path = hdfs://my-hdfs:8020/logs
agent.sinks.my-sink.hdfs.filePrefix = Log
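# Number of events written to the file before it is flushed to HDFS
agent.sinks.my-sink.hdfs.batchSize = 10
# Roll on time: every 3600 seconds (0 would disable time-based rolling)
agent.sinks.my-sink.hdfs.rollInterval = 3600
# Roll on event count: disabled
agent.sinks.my-sink.hdfs.rollCount = 0
# Roll on size: 134217728 bytes = 128 MiB
agent.sinks.my-sink.hdfs.rollSize = 134217728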
agent.sinks.my-sink.hdfs.fileType = DataStream
agent.sinks.my-sink.channel = my-ch