Re: A customer use case / using spoolDir
Patrick Wendell 2012-12-07, 00:50
To answer your other questions: The spooling source will pick up files
in the directory, send them with Flume, and rename them to indicate
that they have been transferred. Files that were already in the
directory before you started will be read and sent through Flume. It
treats these like any other files.
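For reference, here is a rough sketch of the source section from the config quoted below, with the channel binding Alex mentions added. The fileSuffix line is optional and only spells out what I believe is the default rename suffix in 1.3.0; treat that as an assumption and check the user guide for your build.

```properties
# Source definition from the original config, plus the missing channel binding.
agent1.sources = spooldir-1
agent1.sources.spooldir-1.type = spooldir
agent1.sources.spooldir-1.spoolDir = /opt/apache2/logs/flumeSpool
agent1.sources.spooldir-1.fileHeader = true
# The line whose absence caused "No Channels configured for spooldir-1".
agent1.sources.spooldir-1.channels = MemoryChannel-2
# Assumed default: files that have been fully sent are renamed with this suffix.
agent1.sources.spooldir-1.fileSuffix = .COMPLETED
```

Note that a source takes "channels" (plural) while a sink takes "channel" (singular), as in the HDFS sink lines of the quoted config.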
On Wed, Dec 5, 2012 at 4:34 AM, Alexander Alten-Lorenz
<[EMAIL PROTECTED]> wrote:
> As the error message says:
>> No Channels configured for spooldir-1
> The source has no channel assigned. Add this line:
> agent1.sources.spooldir-1.channels = MemoryChannel-2
> When a file is dropped into the directory, the source should pick it up. If there are already files inside, they will be processed (if I'm not totally wrong).
> - Alex
> On Dec 5, 2012, at 1:00 PM, Emile Kao <[EMAIL PROTECTED]> wrote:
>> Thank you for the hint to use the new spoolDir feature in the freshly released 1.3.0 version of Flume.
>> Unfortunately, I am not getting the expected result.
>> Here is my configuration:
>> agent1.channels = MemoryChannel-2
>> agent1.channels.MemoryChannel-2.type = memory
>> agent1.sources = spooldir-1
>> agent1.sources.spooldir-1.type = spooldir
>> agent1.sources.spooldir-1.spoolDir = /opt/apache2/logs/flumeSpool
>> agent1.sources.spooldir-1.fileHeader = true
>> agent1.sinks = HDFS
>> agent1.sinks.HDFS.channel = MemoryChannel-2
>> agent1.sinks.HDFS.type = hdfs
>> agent1.sinks.HDFS.hdfs.fileType = DataStream
>> agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000
>> agent1.sinks.HDFS.hdfs.writeFormat = Text
>> Upon start I am getting the following warning:
>> 2012-12-05 11:05:19,216 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:571)] Removed spooldir-1 due to No Channels configured for spooldir-1
>> 1) Is something wrong in the above config?
>> 2) How are the files gathered from the spool directory? Every time I drop (copy, etc...) a file in it?
>> 3) What happens to the files that were already in the spool directory before I start the flume agent?
>> I would appreciate any help!
>> -------- Original Message --------
>>> Date: Tue, 4 Dec 2012 06:48:46 -0800
>>> From: Mike Percy <[EMAIL PROTECTED]>
>>> To: [EMAIL PROTECTED]
>>> Subject: Re: A customer use case
>>> Hi Emile,
>>> On Tue, Dec 4, 2012 at 2:04 AM, Emile Kao <[EMAIL PROTECTED]> wrote:
>>>> 1. Which is the best way to implement such a scenario using Flume/
>>> You could use the file spooling client / source (in the latest trunk and
>>> upcoming Flume 1.3.0 builds) to stream these files back, along with the hdfs sink.
>>>> 2. The customer would like to keep the log files in their original state
>>>> (file name, size, etc..). Is it practicable using Flume?
>>> Not recommended. Flume is an event streaming system, not a file copying
>>> mechanism. If you want to do that, just use some scripts with hadoop fs
>>> -put instead of Flume. Flume provides a bunch of stream-oriented features
>>> on top of its event streaming architecture, such as data enrichment
>>> capabilities, event routing, and configurable file rolling on HDFS, to
>>> name a few.
> Alexander Alten-Lorenz
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF