A customer use case / using spoolDir
Hello,
Thank you for the hint to use the new spoolDir feature in the freshly released 1.3.0 version of Flume.

Unfortunately, I am not getting the expected result.
Here is my configuration:

agent1.channels = MemoryChannel-2
agent1.channels.MemoryChannel-2.type = memory

agent1.sources = spooldir-1
agent1.sources.spooldir-1.type = spooldir
agent1.sources.spooldir-1.spoolDir = /opt/apache2/logs/flumeSpool
agent1.sources.spooldir-1.fileHeader = true

agent1.sinks = HDFS
agent1.sinks.HDFS.channel = MemoryChannel-2
agent1.sinks.HDFS.type = hdfs
agent1.sinks.HDFS.hdfs.fileType = DataStream
agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000
agent1.sinks.HDFS.hdfs.writeFormat = Text
Upon startup I am getting the following warning:
2012-12-05 11:05:19,216 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:571)] Removed spooldir-1 due to No Channels configured for spooldir-1
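For context, this warning typically indicates that the source was never bound to a channel; in Flume 1.x that binding is its own property on the source. A minimal sketch of what the missing line might look like, reusing the names from the config above:

# bind the spooldir source to the memory channel (sketch; names taken from the config above)
agent1.sources.spooldir-1.channels = MemoryChannel-2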

Question:

1) Is something wrong in the above config?

2) How are files gathered from the spool directory? Is it every time I drop (copy, move, etc.) a file into it?

3) What happens to the files that were already in the spool directory before I start the Flume agent?

I would appreciate any help!

Cheers,
Emile
-------- Original Message --------
> Date: Tue, 4 Dec 2012 06:48:46 -0800
> From: Mike Percy <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Re: A customer use case

> Hi Emile,
>
> On Tue, Dec 4, 2012 at 2:04 AM, Emile Kao <[EMAIL PROTECTED]> wrote:
> >
> > 1. Which is the best way to implement such a scenario using Flume/
> > Hadoop?
> >
>
> You could use the file spooling client / source to stream these files back
> in the latest trunk and upcoming Flume 1.3.0 builds, along with hdfs sink.
>
> > 2. The customer would like to keep the log files in their original state
> > (file name, size, etc.). Is it practicable using Flume?
> >
>
> Not recommended. Flume is an event streaming system, not a file copying
> mechanism. If you want to do that, just use some scripts with hadoop fs
> -put instead of Flume. Flume provides a bunch of stream-oriented features
> on top of its event streaming architecture, such as data enrichment
> capabilities, event routing, and configurable file rolling on HDFS, to
> name a few.
>
> Regards,
> Mike
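A minimal sketch of the scripted hadoop fs -put alternative Mike describes; the local log directory and the HDFS target directory below are hypothetical placeholders:

#!/bin/sh
# Copy rotated Apache log files into HDFS as-is, keeping their original names.
# /opt/apache2/logs/archive and /logs/apache are placeholder paths.
for f in /opt/apache2/logs/archive/*.log; do
    hadoop fs -put "$f" /logs/apache/ && mv "$f" "$f.uploaded"
done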