Flume, mail # user - gz compressed files and Spool Dir Source


gz compressed files and Spool Dir Source
Marcelo Valle 2013-11-27, 16:35
Hello,

I'm trying to pick up gz files from the Spool Dir source, and the following
error is shown:

ERROR source.SpoolDirectorySource: FATAL: Spool Directory source
spoolDirSource: { spoolDir: /home/hadoop/spoolDir }: Uncaught exception in
SpoolDirectorySource thread. Restart or reconfigure Flume to continue
processing.
java.nio.charset.MalformedInputException: Input length = 1
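
A likely reading of this exception (an assumption, not something confirmed in this thread) is that the source is decoding the raw gzip bytes as text in a character set such as UTF-8, and those bytes are not valid in it. The sketch below is a Python analogue of that failure, not Flume code; `is_valid_utf8` is a hypothetical helper introduced only for illustration.

```python
import gzip


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A plain text line, as a line-oriented reader would expect to see it.
plain = "a log line\n".encode("utf-8")

# The same line after gzip compression. The gzip header starts with the
# bytes 0x1f 0x8b, and 0x8b cannot start a UTF-8 sequence.
compressed = gzip.compress(plain)

print(is_valid_utf8(plain))
print(is_valid_utf8(compressed))
```

If this reading is right, the error comes from the file contents being binary rather than from the Spool Dir configuration itself.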

If I decompress the file first, the Flume agent source picks it up correctly.
Is it possible to use Spool Dir with gz compressed files? Or do I have to
decompress them before copying them into the spool directory?
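
For illustration, the decompress-first workaround could be sketched roughly as below. This is a minimal sketch under stated assumptions, not a Flume feature: `stage_gz_into_spool` and the staging directory are hypothetical, and the staging directory is assumed to sit on the same filesystem as the spool directory so that the final rename is atomic and the source never sees a half-written file.

```python
import gzip
import os
import shutil
from pathlib import Path


def stage_gz_into_spool(gz_path, staging_dir, spool_dir):
    """Decompress gz_path into staging_dir, then move the result into spool_dir.

    Hypothetical helper: the names and the directory layout are assumptions.
    """
    gz_path = Path(gz_path)
    staging_dir = Path(staging_dir)
    spool_dir = Path(spool_dir)

    # "events.log.gz" becomes "events.log"; other names get a ".out" suffix.
    if gz_path.suffix == ".gz":
        name = gz_path.stem
    else:
        name = gz_path.name + ".out"

    # Decompress outside the spool directory so the file is complete
    # before the source can see it.
    staged = staging_dir / name
    with gzip.open(gz_path, "rb") as src, open(staged, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Rename into place; atomic only when both paths are on one filesystem.
    final = spool_dir / name
    os.replace(staged, final)
    return final
```

A call such as `stage_gz_into_spool("/tmp/in/events.log.gz", "/home/hadoop/staging", "/home/hadoop/spoolDir")` would leave `events.log` in the spool directory; the input and staging paths here are made up for the example.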

My config:

# Source properties
client.sources.spoolDirSource.type = spooldir
client.sources.spoolDirSource.channels = memoryChannel
client.sources.spoolDirSource.spoolDir = /home/hadoop/spoolDir
# Once a file has been fully transmitted, it is renamed with fileSuffix
client.sources.spoolDirSource.fileSuffix = .DONE
# Default values for other properties
client.sources.spoolDirSource.fileHeader = false
client.sources.spoolDirSource.fileHeaderKey = file
# Number of events per batch: higher -> better throughput, but a larger rollback on failure.
client.sources.spoolDirSource.batchSize = 100
# Buffer max size = bufferMaxLines * maxBufferLineLength
client.sources.spoolDirSource.bufferMaxLines = 100
client.sources.spoolDirSource.maxBufferLineLength = 5000
# Delete files after processing
#client.sources.spoolDirSource.deletePolicy = immediate

# Add timestamp to header
client.sources.spoolDirSource.interceptors = t1
client.sources.spoolDirSource.interceptors.t1.type = timestamp

Thanks in advance!!

~ Marcelo