Flume >> mail # user >> Flume not moving data to HDFS or local


Re: Flume not moving data to HDFS or local
Thank you so much, Paul. You are a lifesaver. :)

Sent from my iPhone

> On Oct 31, 2013, at 8:11 PM, "Paul Chavez" <[EMAIL PROTECTED]> wrote:
>
> Here’s a piece of my app server configuration. It’s for IIS logs and has an interceptor to pull a timestamp out of the event data. It’s backed by a fileChannel and I drop files into the spool directory once a minute.
>  
> # SpoolDir source for Weblogs
> appserver.sources.spool_WebLogs.type = spooldir
> appserver.sources.spool_WebLogs.spoolDir = c:\\flume_data\\spool\\web
> appserver.sources.spool_WebLogs.channels = fc_WebLogs
> appserver.sources.spool_WebLogs.batchSize = 1000
> appserver.sources.spool_WebLogs.bufferMaxLines = 1200
> appserver.sources.spool_WebLogs.bufferMaxLineLength = 5000
>  
> appserver.sources.spool_WebLogs.interceptors = add_time
> appserver.sources.spool_WebLogs.interceptors.add_time.type = regex_extractor
> appserver.sources.spool_WebLogs.interceptors.add_time.regex = \\t(\\d{4}-\\d{2}-\\d{2}.\\d{2}:\\d{2})
> appserver.sources.spool_WebLogs.interceptors.add_time.serializers = millis
> appserver.sources.spool_WebLogs.interceptors.add_time.serializers.millis.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
> appserver.sources.spool_WebLogs.interceptors.add_time.serializers.millis.name = timestamp
> appserver.sources.spool_WebLogs.interceptors.add_time.serializers.millis.pattern = yyyy-MM-dd HH:mm
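
A fragment like the one above also needs the agent's component lists and a definition for the channel it references. The following is a minimal sketch, assuming the file channel is named fc_WebLogs as in the source definition. The checkpoint and data directory paths are hypothetical, and the sink is left out because the thread does not show it.

```properties
# Component lists for the agent named "appserver"
appserver.sources = spool_WebLogs
appserver.channels = fc_WebLogs

# File channel backing the spooldir source.
# The directory paths below are assumptions, not from the original config.
appserver.channels.fc_WebLogs.type = file
appserver.channels.fc_WebLogs.checkpointDir = c:\\flume_data\\channel\\web\\checkpoint
appserver.channels.fc_WebLogs.dataDirs = c:\\flume_data\\channel\\web\\data
```

Files placed in the spool directory should be complete and not modified afterwards; the source renames each file with a .COMPLETED suffix once it has been fully read.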
>  
> Hope that helps,
> Paul Chavez
>  
>  
> From: Siddharth Tiwari [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, October 31, 2013 7:05 PM
> To: [EMAIL PROTECTED]
> Subject: RE: Flume not moving data to HDFS or local
>  
> Can you describe the process to set up a spooling directory source? I am sorry, I do not know how to do that. If you can give me a step-by-step description of how to configure it, and the changes I need to make in my conf to get it done, I will be really thankful. Appreciate your help :)
>
>
> *------------------------*
> Cheers !!!
> Siddharth Tiwari
> Have a refreshing day !!!
> "Every duty is holy, and devotion to duty is the highest form of worship of God."
> "Maybe other people will try to limit me but I don't limit myself"
>
>
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Date: Thu, 31 Oct 2013 14:38:54 -0700
> Subject: RE: Flume not moving data to HDFS or local
>
> It should commit when one of the various file roll configuration values is hit. There’s a list of them and their defaults in the Flume User Guide.
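
For reference, the roll settings in question belong to the HDFS sink. The following is a sketch only: the agent name agent1, the sink name hdfs_sink, the channel name and the HDFS path are all hypothetical, since the original sink configuration is not shown in this thread. The values are illustrative, and the defaults noted in the comments should be checked against the user guide for the Flume version in use.

```properties
# Hypothetical agent, sink and channel names; path is a placeholder
agent1.sinks.hdfs_sink.type = hdfs
agent1.sinks.hdfs_sink.channel = mem_channel
agent1.sinks.hdfs_sink.hdfs.path = hdfs://namenode/flume/weblogs

# Roll (close and rename) the current file after 10 minutes (default: 30 seconds; 0 disables)
agent1.sinks.hdfs_sink.hdfs.rollInterval = 600
# Roll after roughly 128 MB have been written (default: 1024 bytes; 0 disables)
agent1.sinks.hdfs_sink.hdfs.rollSize = 134217728
# Do not roll based on event count (default: 10 events; 0 disables)
agent1.sinks.hdfs_sink.hdfs.rollCount = 0
# Close a file that has received no events for 60 seconds (default: 0, disabled)
agent1.sinks.hdfs_sink.hdfs.idleTimeout = 60
```

The .tmp suffix marks a file the sink still has open; it is dropped when the file is closed by whichever of these triggers fires first.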
>  
> For managing new files on your app servers, the best option right now seems to be a spooling directory source along with some kind of cron job that runs locally on each app server and drops files into the spool directory when they are ready. In my case I run a job that executes a custom script to checkpoint a file that is appended to all day long, creating incremental files every minute to drop in the spool directory.
>  
>  
> From: Siddharth Tiwari [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, October 31, 2013 12:47 PM
> To: [EMAIL PROTECTED]
> Subject: RE: Flume not moving data to HDFS or local
>  
>
> It got resolved; it was due to the wrong version of the guava jar file in the Flume lib directory. But I can still see a .tmp extension on the file in HDFS. When does it actually get committed? :) One more question, though: what should I change in my configuration file to capture new files being generated in a directory on a remote machine?
> Say, for example, there is one new file generated every hour in my web server's hostlog directory. What do I change in my configuration so that I get the new file directly in HDFS, compressed?
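
On the compression part of this question, the HDFS sink can write compressed output through its fileType and codeC settings. A minimal sketch, again using the hypothetical names agent1 and hdfs_sink; gzip is chosen only as an example codec.

```properties
# Hypothetical agent and sink names, not taken from the thread
agent1.sinks.hdfs_sink.type = hdfs
# CompressedStream requires hdfs.codeC to be set
agent1.sinks.hdfs_sink.hdfs.fileType = CompressedStream
# Example codec; gzip, bzip2, lzo, lzop and snappy are the documented choices
agent1.sinks.hdfs_sink.hdfs.codeC = gzip
```

Note that the sink compresses the stream of events it writes, so the files in HDFS follow the sink's roll settings rather than mirroring each hourly source file one to one.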
>
> *------------------------*
> Cheers !!!
> Siddharth Tiwari
> Have a refreshing day !!!
> "Every duty is holy, and devotion to duty is the highest form of worship of God."
> "Maybe other people will try to limit me but I don't limit myself"
>
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: RE: Flume not moving data to HDFS or local