Flume >> mail # user >> gz compressed files and Spool Dir Source


gz compressed files and Spool Dir Source
Hello,

I'm trying to pick up gz files with the Spool Dir source, and the following
error is shown:

ERROR source.SpoolDirectorySource: FATAL: Spool Directory source
spoolDirSource: { spoolDir: /home/hadoop/spoolDir }: Uncaught exception in
SpoolDirectorySource thread. Restart or reconfigure Flume to continue
processing.
java.nio.charset.MalformedInputException: Input length = 1
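
One plausible reading of this exception, offered as an assumption rather than a confirmed diagnosis: the source is decoding the file as text, and raw gzip bytes are not valid text in a UTF-8 style charset. The short Python sketch below only illustrates that property of gzip data; it does not reproduce Flume's own reader, and the helper name `decodes_as_utf8` is made up for the illustration.

```python
import gzip


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 text."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


plain = b"one log line\n"
compressed = gzip.compress(plain)

# Every gzip stream starts with the magic bytes 0x1f 0x8b.
# 0x8b is not a valid first byte of a UTF-8 character, so strict
# text decoding fails on the second byte of the file.
print(compressed[:2].hex())
print(decodes_as_utf8(plain))
print(decodes_as_utf8(compressed))
```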

If I decompress the file, the Flume agent source picks it up correctly.
Is it possible to use Spool Dir with gz compressed files, or do I have to
decompress them before copying them into the spool directory?
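
For reference, the "decompress before copying" route could look roughly like the Python sketch below. It is only a sketch under stated assumptions: the function name `gunzip_into_spool` and the separate staging directory are hypothetical and not part of Flume. The file is decompressed outside the spool directory and then moved in with a single rename, so the source never sees a half-written file. The rename is only atomic when the staging directory is on the same filesystem as the spool directory.

```python
import gzip
import os
import shutil


def gunzip_into_spool(gz_path: str, staging_dir: str, spool_dir: str) -> str:
    """Decompress gz_path in staging_dir, then move the result into spool_dir.

    Returns the final path of the decompressed file.
    """
    name = os.path.basename(gz_path)
    if name.endswith(".gz"):
        name = name[:-3]  # drop the .gz extension

    # Decompress into the staging directory first, streaming in chunks
    # so large files are not loaded into memory at once.
    staged_path = os.path.join(staging_dir, name)
    with gzip.open(gz_path, "rb") as src, open(staged_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Move the finished file into the spool directory in one step.
    # Note: os.replace overwrites an existing file with the same name.
    final_path = os.path.join(spool_dir, name)
    os.replace(staged_path, final_path)
    return final_path
```

A call would then be something like `gunzip_into_spool("/some/input/app.log.gz", "/home/hadoop/staging", "/home/hadoop/spoolDir")`, where the input and staging paths are placeholders.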

My config:

# Source properties
client.sources.spoolDirSource.type = spooldir
client.sources.spoolDirSource.channels = memoryChannel
client.sources.spoolDirSource.spoolDir = /home/hadoop/spoolDir
# Once a file has been transmitted, it is renamed with fileSuffix
client.sources.spoolDirSource.fileSuffix = .DONE
# Default values for other properties
client.sources.spoolDirSource.fileHeader = false
client.sources.spoolDirSource.fileHeaderKey = file
# Number of events per batch: higher -> better throughput, but a larger rollback on failure.
client.sources.spoolDirSource.batchSize = 100
# Buffer max size = bufferMaxLines * maxBufferLineLength
client.sources.spoolDirSource.bufferMaxLines = 100
client.sources.spoolDirSource.maxBufferLineLength = 5000
# Clean up files after processing
#client.sources.spoolDirSource.deletePolicy = immediate

# Add timestamp to header
client.sources.spoolDirSource.interceptors = t1
client.sources.spoolDirSource.interceptors.t1.type = timestamp

Thanks in advance!!

~ Marcelo