Flume >> mail # user >> picking up new files in Flume NG
Re: picking up new files in Flume NG
Hey Sadu, your use case is exactly what I'm writing this for. I'm
hoping this patch will get committed within a few days; we're on the
last round of reviews.

- Patrick

On Tue, Oct 16, 2012 at 10:47 AM, Brock Noland <[EMAIL PROTECTED]> wrote:
> Correct, it's only available in that patch. From the RB, it looks like
> it's not too far off from being committed.
>
> Brock
>
> On Tue, Oct 16, 2012 at 12:00 PM, Sadananda Hegde <[EMAIL PROTECTED]> wrote:
>> Yes, it is very similar.
>>
>> The spool directory will keep getting new files. We need to scan the
>> directory, send the data in the existing files to HDFS, clean up the files
>> (delete, move, rename, etc.), and scan for new files again. The Spooldir
>> source is not available yet, right?
>>
>> Thanks,
>> Sadu
>>
>>
>> On Tue, Oct 16, 2012 at 10:11 AM, Brock Noland <[EMAIL PROTECTED]> wrote:
>>>
>>> Sounds like https://issues.apache.org/jira/browse/FLUME-1425 ?
>>>
>>> Brock
>>>
>>> On Mon, Oct 15, 2012 at 11:37 PM, Sadananda Hegde <[EMAIL PROTECTED]>
>>> wrote:
>>> > Hello,
>>> >
>>> > I have a scenario in which the client application is continuously
>>> > pushing XML messages. The application writes these messages to new
>>> > files in the same directory, so we will keep getting new files
>>> > throughout the day. I am trying to configure Flume agents on these
>>> > application servers (4 of them) to pick up the new data and transfer
>>> > it to HDFS on a Hadoop cluster. How should I configure my source to
>>> > pick up new files (and exclude the files that have already been
>>> > processed)? I don't think Exec source with tail -F will work in this
>>> > scenario because data is not being appended to existing files;
>>> > rather, new files get created.
>>> >
>>> > Thank you very much for your time and support.
>>> >
>>> > Sadu
>>>
>>>
>>>
>>> --
>>> Apache MRUnit - Unit testing MapReduce -
>>> http://incubator.apache.org/mrunit/
>>
>>
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
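
For readers following the thread, the scan / deliver / clean up / rescan loop
Sadu describes can be sketched in a few lines. This is only an illustration of
the idea, not the code in the FLUME-1425 patch. The names scan_once, deliver,
and the ".COMPLETED" marker suffix are assumptions made for this sketch, and it
assumes each file is complete by the time it appears in the directory (for
example, written elsewhere and then moved in).

```python
import tempfile
from pathlib import Path

# Hypothetical marker appended to a file name once the file has been delivered.
DONE_SUFFIX = ".COMPLETED"


def scan_once(spool_dir, deliver):
    """Deliver each unprocessed file once, then rename it so later scans skip it.

    `deliver` is a caller-supplied callback taking (file_name, file_bytes);
    in a real agent it would hand the data to a channel or sink.
    Returns the names of the files processed in this pass.
    """
    processed = []
    # Snapshot the listing first so renames do not disturb the iteration.
    for path in sorted(Path(spool_dir).iterdir()):
        if not path.is_file() or path.name.endswith(DONE_SUFFIX):
            continue  # skip subdirectories and files already handled
        deliver(path.name, path.read_bytes())
        # Rename only after delivery succeeds, so a failure leaves the
        # file in place to be retried on the next scan.
        path.rename(path.with_name(path.name + DONE_SUFFIX))
        processed.append(path.name)
    return processed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as spool:
        Path(spool, "msg-001.xml").write_text("<msg>hello</msg>")
        first = scan_once(spool, lambda name, data: print("delivered", name))
        second = scan_once(spool, lambda name, data: print("delivered", name))
        print(first, second)
```

Calling scan_once on a timer gives the "scan for new files again" behavior:
files already renamed are ignored, and only new arrivals are delivered.
Deleting or moving the file instead of renaming it would work the same way.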