Flume, mail # user - spooling directory source and variable replacement


Re: spooling directory source and variable replacement
Frank Maritato 2013-06-06, 18:29
Thanks Phil. If I end up building my own, I'll contribute it back so that other people who want date placeholders in directories can use it.

--
Frank Maritato
[EMAIL PROTECTED]
On Jun 6, 2013, at 11:20 AM, Phil Scala <[EMAIL PROTECTED]> wrote:

Frank, as Nitin said, Flume currently monitors only the spool directory itself; child directories are not monitored.

If it's any consolation, I have a working patch against the Flume 1.4 code base that adds support for subdirectories (FLUME-1899 <https://issues.apache.org/jira/browse/FLUME-1899>). But for now Nitin has it right.

Thanks
Phil
From: Nitin Pawar [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 04, 2013 12:56 AM
To: [EMAIL PROTECTED]
Subject: Re: spooling directory source and variable replacement

Frank,
With the spooling directory source, Flume will always pick up every new file dropped into the directory. Make sure that directory never contains files that are still being written.

In the usual setup, the application writes its current logs to a staging directory, and you use logrotate to move finished files from that log directory into the spooling directory. The spooling directory source is configured with the directory to pick files up from; you cannot put a file name in the config (as far as I know).

If you only care about those particular files, I would suggest moving only those files into the spooling directory.
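
One way to do that move outside Flume is a small script run from cron. The following is only a sketch under stated assumptions: the function name `flatten_into_spool`, the `tmp_dir` argument, and the `*.gz` pattern are illustrative, and `tmp_dir` is assumed to live on the same filesystem as the spooling directory so that the final rename is atomic and Flume never sees a half-written file. The subdirectory parts are folded into the file name so that names stay unique in the flat spooling directory.

```python
import os
import shutil
from pathlib import Path
from typing import List


def flatten_into_spool(staging_root: Path, tmp_dir: Path, spool_dir: Path,
                       pattern: str = "*.gz") -> List[Path]:
    """Move finished files from a nested staging tree into a flat spool directory.

    Assumptions (not from the thread): tmp_dir is on the same filesystem as
    spool_dir, and every file matching `pattern` is already complete.
    """
    tmp_dir.mkdir(parents=True, exist_ok=True)
    spool_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(staging_root.rglob(pattern)):
        if not src.is_file():
            continue
        # e.g. 2013/06/04/01/app-1.gz -> 2013_06_04_01_app-1.gz
        flat_name = "_".join(src.relative_to(staging_root).parts)
        dest = spool_dir / flat_name
        if dest.exists():
            continue  # never reuse a name that is already in the spool directory
        tmp = tmp_dir / flat_name
        # Step 1: possibly slow, possibly cross-filesystem move into tmp_dir.
        shutil.move(str(src), str(tmp))
        # Step 2: rename within one filesystem, so the file appears all at once.
        os.rename(str(tmp), str(dest))
        moved.append(dest)
    return moved
```

A call such as `flatten_into_spool(Path("/mnt/remote/application_name"), Path("/data/flume/tmp"), Path("/data/flume/spool"))` would then feed a spooling directory source pointed at `/data/flume/spool`; both `/data/flume` paths are made up for the example.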

On Tue, Jun 4, 2013 at 2:11 AM, Frank Maritato <[EMAIL PROTECTED]> wrote:
Hi All,

The application I'd like to grab log files from is rotating them into subdirectories by time stamp. For example,

/mnt/remote/application_name/yyyy/mm/dd/hh/[filename]-[timestamp].gz

Is there any way to configure the spooling directory source in Flume with time variables such that it can find these files? Or is there a better way to do this?
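
To make the layout concrete, here is a minimal sketch of how the hourly directories could be computed outside Flume, assuming the `yyyy/mm/dd/hh` components map to the `strftime` fields `%Y/%m/%d/%H`. The function name `hourly_dirs` is hypothetical, and this is not a Flume feature.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath
from typing import List


def hourly_dirs(root: str, start: datetime, hours: int) -> List[PurePosixPath]:
    """Return the yyyy/mm/dd/hh directories for `hours` consecutive hours from `start`."""
    return [
        PurePosixPath(root) / (start + timedelta(hours=i)).strftime("%Y/%m/%d/%H")
        for i in range(hours)
    ]


if __name__ == "__main__":
    for d in hourly_dirs("/mnt/remote/application_name", datetime(2013, 6, 3, 23), 2):
        print(d)
```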

Thanks
--
Frank Maritato
--
Nitin Pawar