

Re: Flume not moving data to HDFS or local
Thank you so much, Paul. You are a life saver. :)

Sent from my iPhone

> On Oct 31, 2013, at 8:11 PM, "Paul Chavez" <[EMAIL PROTECTED]> wrote:
>
> Here’s a piece of my app server configuration. It’s for IIS logs and has an interceptor to pull a timestamp out of the event data. It’s backed by a fileChannel and I drop files into the spool directory once a minute.
>  
> # SpoolDir source for Weblogs
> appserver.sources.spool_WebLogs.type = spooldir
> appserver.sources.spool_WebLogs.spoolDir = c:\\flume_data\\spool\\web
> appserver.sources.spool_WebLogs.channels = fc_WebLogs
> appserver.sources.spool_WebLogs.batchSize = 1000
> appserver.sources.spool_WebLogs.bufferMaxLines = 1200
> appserver.sources.spool_WebLogs.bufferMaxLineLength = 5000
>  
> appserver.sources.spool_WebLogs.interceptors = add_time
> appserver.sources.spool_WebLogs.interceptors.add_time.type = regex_extractor
> appserver.sources.spool_WebLogs.interceptors.add_time.regex = \\t(\\d{4}-\\d{2}-\\d{2}.\\d{2}:\\d{2})
> appserver.sources.spool_WebLogs.interceptors.add_time.serializers = millis
> appserver.sources.spool_WebLogs.interceptors.add_time.serializers.millis.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
> appserver.sources.spool_WebLogs.interceptors.add_time.serializers.millis.name = timestamp
> appserver.sources.spool_WebLogs.interceptors.add_time.serializers.millis.pattern = yyyy-MM-dd HH:mm
>  
> Hope that helps,
> Paul Chavez
>  
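
To see what the add_time interceptor above does to each event, here is a rough Python sketch of the same idea: find the first tab-prefixed timestamp in the line, parse it as yyyy-MM-dd HH:mm, and store epoch milliseconds under a "timestamp" header. This is not Flume code. The function name, the sample log line, and the choice of UTC are assumptions for illustration only (as far as I know the Flume serializer parses in the JVM's default time zone).

```python
import re
from datetime import datetime, timezone

# Same expression as add_time.regex once the properties-file
# escaping (\\ -> \) is removed.
TIMESTAMP_RE = re.compile(r"\t(\d{4}-\d{2}-\d{2}.\d{2}:\d{2})")


def extract_timestamp_millis(line):
    """Return epoch milliseconds for the first timestamp in line, or None."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    # Python equivalent of the serializer pattern yyyy-MM-dd HH:mm.
    parsed = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M")
    # Assumption: treat the log time as UTC so the result is deterministic.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


# Hypothetical tab-separated log line, not taken from the thread.
sample = "W3SVC1\t2013-10-31 20:11:42\tGET\t/index.html\t200"
headers = {"timestamp": extract_timestamp_millis(sample)}
print(headers)
```

A "timestamp" header like this is what lets a downstream sink resolve time-based escapes in its output path from the event's own time rather than the arrival time.
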
>  
> From: Siddharth Tiwari [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, October 31, 2013 7:05 PM
> To: [EMAIL PROTECTED]
> Subject: RE: Flume not moving data to HDFS or local
>  
> Can you describe the process to set up a spooling directory source? I am sorry, I do not know how to do that. If you can give me a step-by-step description of how to configure it, and the configuration changes I need to make in my conf to get it done, I will be really thankful. Appreciate your help :)
>
>
> *------------------------*
> Cheers !!!
> Siddharth Tiwari
> Have a refreshing day !!!
> "Every duty is holy, and devotion to duty is the highest form of worship of God."
> "Maybe other people will try to limit me but I don't limit myself"
>
>
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Date: Thu, 31 Oct 2013 14:38:54 -0700
> Subject: RE: Flume not moving data to HDFS or local
>
> It should commit when one of the file roll configuration values is hit. There’s a list of them and their defaults in the Flume user guide.
>  
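
As a simplified model of "when one of the roll values is hit", the sketch below checks the three HDFS sink roll settings from the Flume user guide: hdfs.rollInterval (default 30 seconds), hdfs.rollSize (default 1024 bytes), and hdfs.rollCount (default 10 events), where 0 disables a trigger. The function itself is hypothetical and only illustrates the rule; it is not how the sink is implemented. Until a roll happens the open file keeps its in-use suffix (.tmp by default).

```python
def should_roll(age_seconds, bytes_written, events_written,
                roll_interval=30, roll_size=1024, roll_count=10):
    """Return True when any enabled roll trigger has been reached.

    Defaults mirror hdfs.rollInterval, hdfs.rollSize and hdfs.rollCount;
    a value of 0 switches that trigger off.
    """
    return any([
        roll_interval > 0 and age_seconds >= roll_interval,
        roll_size > 0 and bytes_written >= roll_size,
        roll_count > 0 and events_written >= roll_count,
    ])


# A file that is 5 seconds old with 3 small events stays open (.tmp).
print(should_roll(age_seconds=5, bytes_written=100, events_written=3))
# The same file after 30 seconds is rolled and loses the .tmp suffix.
print(should_roll(age_seconds=30, bytes_written=100, events_written=3))
```

With all three values set to 0 nothing in this model ever triggers, so the file would stay open unless something else closes it.
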
> For managing new files on your app servers, the best option right now seems to be a spooling directory source along with some kind of cron job that runs locally on the app servers to drop files in the spool directory when they are ready. In my case I run a job that executes a custom script to checkpoint a file that is appended to all day long, creating incremental files every minute to drop in the spool directory.
>  
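
The custom script is not shown in the thread, so the following is only a guess at what such a checkpoint job might look like: remember a byte offset, copy the complete lines appended since the last run into a uniquely named file, and move that file into the spool directory in one step, since the spooling directory source expects files that are never modified after they land and names that are never reused. All function, parameter, and file names here are hypothetical.

```python
import os
import time


def checkpoint(log_path, offset_path, staging_dir, spool_dir):
    """Copy lines appended to log_path since the last run into a new spool file.

    Returns the path of the new spool file, or None when no complete
    new line is available yet.
    """
    # Read the byte offset saved by the previous run (0 on the first run).
    try:
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    except FileNotFoundError:
        offset = 0

    # If the log shrank it was probably rotated, so start from the top.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as log:
        log.seek(offset)
        chunk = log.read()

    # Ship only complete lines; a partial last line waits for the next run.
    end = chunk.rfind(b"\n") + 1
    if end == 0:
        return None

    # Write outside the spool directory first, then move the finished file
    # in, so the source never sees a file that is still being written.
    name = "web-%d-%d.log" % (time.time_ns(), offset)
    staged = os.path.join(staging_dir, name)
    with open(staged, "wb") as out:
        out.write(chunk[:end])
    final = os.path.join(spool_dir, name)
    os.replace(staged, final)  # atomic when both dirs share a filesystem

    # Save the new offset only after the file is safely in place.
    with open(offset_path, "w") as f:
        f.write(str(offset + end))
    return final
```

Run once a minute from cron or Task Scheduler, each call drops at most one new immutable file into the spool directory. The staging directory should sit on the same volume as the spool directory so the final move is a rename rather than a copy.
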
>  
> From: Siddharth Tiwari [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, October 31, 2013 12:47 PM
> To: [EMAIL PROTECTED]
> Subject: RE: Flume not moving data to HDFS or local
>  
>
> It got resolved; it was due to the wrong version of the guava jar file in the flume lib directory. But I can still see a .tmp extension on the file in HDFS. When does it actually get committed? :) One more question, though: what should I change in my configuration file to capture new files being generated in a directory on a remote machine?
> Say, for example, one new file is generated every hour in my web server's hostlog directory. What do I change in my configuration so that I get the new file directly in HDFS, compressed?
>
>
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: RE: Flume not moving data to HDFS or local