Flume >> mail # user >> Help with spooling directory source


Re: Help with spooling directory source
Thank you.
On Tue, Feb 19, 2013 at 9:20 PM, Brock Noland <[EMAIL PROTECTED]> wrote:

> Hi,
>
> The spooling directory source expects immutable, uniquely named files
> as described here:
>
> http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source
>
> As such you should log to a separate directory and then, on roll, move the
> file (uniquely named) into the spooling directory.
>
> Brock
>
>
> On Tue, Feb 19, 2013 at 5:08 AM, Robert George <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I'm trying to use Flume's Spooling Directory Source to move my logs from
>> EC2 instances to AWS S3.
>>
>>
>> I'm using rotatelogs to create log files of 5 MB each.
>>
>> I have one question. If I set the spooldir source's directory to the
>> directory in which Apache creates its logs, will it work correctly?
>>
>> What I mean is, will Flume wait until Apache finishes writing to the log
>> file?
>>
>> What I observe is that only a couple of lines of the log file arrive at the
>> sink, while the file on the source side has more lines.
>>
>> I don't know whether I have worded my question correctly.
>>
>> My configuration files are below.
>>
>> Apache configuration
>> -----------------------------
>>
>> CustomLog "|/opt/bitnami/apache2/bin/rotatelogs -l /mnt/je/logs/apache/jesites/epicenter/access/%Y-%m-%d-%H-%M-%S-access.log 5M" cookie env=!dontlog
>> CustomLog "|/opt/bitnami/apache2/bin/rotatelogs -l /mnt/je/logs/apache/jesites/epicenter/access/%Y-%m-%d-%H-%M-%S-access_assets.log 5M" combined env=dontlog
>> ErrorLog "|/opt/bitnami/apache2/bin/rotatelogs -l /mnt/je/logs/apache/jesites/epicenter/error/%Y-%m-%d-%H-%M-%S-errorlog.log 5M"
>>
>>
>> Flume configuration
>> ---------------------------
>> #source is of type spooling directory - epicenter access logs
>> agent1.sources.spooldir-epi-access.channels = ch1
>> agent1.sources.spooldir-epi-access.type = spooldir
>> agent1.sources.spooldir-epi-access.spoolDir = /mnt/je/logs/apache/jesites/epicenter/access
>> agent1.sources.spooldir-epi-access.interceptors = i1 hostname type
>> agent1.sources.spooldir-epi-access.interceptors.i1.type = timestamp
>> agent1.sources.spooldir-epi-access.interceptors.hostname.type = host
>> agent1.sources.spooldir-epi-access.interceptors.hostname.useIP = false
>> agent1.sources.spooldir-epi-access.interceptors.hostname.preserveExisting = true
>> agent1.sources.spooldir-epi-access.interceptors.type.type = static
>> agent1.sources.spooldir-epi-access.interceptors.type.key = type
>> agent1.sources.spooldir-epi-access.interceptors.type.value = epi-access
>> agent1.sources.spooldir-epi-access.fileHeader = true
>>
>>
>> --
>> Regards,
>>
>> Robert George
>>
>> http://justEat.in | [EMAIL PROTECTED] | +919986442677
>>
>>
>>
>
>
> --
> Apache MRUnit - Unit testing MapReduce -
> http://incubator.apache.org/mrunit/
>

--
Regards,

Robert George

http://justEat.in | [EMAIL PROTECTED] | +919986442677
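
Brock's advice above (log to a separate directory, then move each rolled file into the spooling directory) could be sketched roughly as follows. This is only an illustration, not something from the thread: the function name `move_rolled_logs`, the idea of a staging directory, and the "not modified for N seconds" test for deciding that a file has been rolled are all assumptions. It also assumes both directories are on the same filesystem, so that `os.rename` moves the file in one step and Flume never sees a half-written file.

```python
import os
import time


def move_rolled_logs(staging_dir, spool_dir, min_idle_seconds=60.0, now=None):
    """Move log files that look finished from staging_dir into spool_dir.

    Hypothetical helper: a file is treated as rolled once it has not been
    modified for at least min_idle_seconds. This is a heuristic chosen for
    the sketch, not a signal provided by rotatelogs or Flume.
    Returns the list of file names that were moved, in sorted order.
    """
    now = time.time() if now is None else now
    moved = []
    for name in sorted(os.listdir(staging_dir)):
        src = os.path.join(staging_dir, name)
        if not os.path.isfile(src):
            continue
        if now - os.path.getmtime(src) < min_idle_seconds:
            continue  # recently modified, Apache may still be writing to it
        dst = os.path.join(spool_dir, name)
        if os.path.exists(dst):
            continue  # the spooling source expects unique names, so never overwrite
        # A rename within one filesystem is a single step, so the file
        # appears in the spooling directory complete and unchanging.
        os.rename(src, dst)
        moved.append(name)
    return moved
```

With a setup like this, Apache's rotatelogs would write into the staging directory, `spoolDir` would point at a different directory, and the function would be run periodically (for example from cron). The timestamped file names that rotatelogs already produces keep the names unique.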