Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Flume >> mail # user >> gz compressed files and Spool Dir Source

Copy link to this message
gz compressed files and Spool Dir Source

I'm triying to pick gz files from Spool Dir and the following error is

ERROR source.SpoolDirectorySource: FATAL: Spool Directory source
spoolDirSource: { spoolDir: /home/hadoop/spoolDir }: Uncaught exception in
SpoolDirectorySource thread. Restart or reconfigure Flume to continue
java.nio.charset.MalformedInputException: Input length = 1

If I descompress the file flume agent source pick correctly.
Is it possible to use Spool Dir with gz compressed files? Or I have to
descompress it before copy them in spool directory.

My config:

# Source properties
client.sources.spoolDirSource.type = spooldir
client.sources.spoolDirSource.channels = memoryChannel
client.sources.spoolDirSource.spoolDir = /home/hadoop/spoolDir
# When a file was been transmitted will be renamed with fileSuffix
client.sources.spoolDirSource.fileSuffix = .DONE
# Default values for other properties
client.sources.spoolDirSource.fileHeader = false
client.sources.spoolDirSource.fileHeaderKey = file
# Number of events, higher-> better troughput, larger rollback in fail.
client.sources.spoolDirSource.batchSize = 100
# Buffer max size = bufferMaxLines * maxBufferLineLength
client.sources.spoolDirSource.bufferMaxLines = 100
client.sources.spoolDirSource.maxBufferLineLength = 5000
# Clean Files after procces
#client.sources.spoolDirSource.deletePolicy = immediate

# Add timestamp to header
client.sources.spoolDirSource.interceptors = t1
client.sources.spoolDirSource.interceptors.t1.type = timestamp

Thanks in advance!!

~ Marcelo