I wouldn't modify the files while Flume is still writing to them. It
might work, but it might also be a complete mess. If you need to modify
the events before they are written, interceptors are the correct solution.
Once Flume has finished with a file, modify it all you wish!
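
To make the interceptor idea a bit more concrete, here is a rough sketch of the "insert the data only if it is missing" step in plain Java. The class name, the `site=` key, and the `dc1` value are made up for illustration, since I don't know what you actually insert with sed. In a real agent this logic would sit inside the `intercept(Event)` method of a class implementing `org.apache.flume.interceptor.Interceptor`, reading the line with `event.getBody()` and writing it back with `event.setBody(...)`; that wiring is left out so the snippet compiles without Flume on the classpath.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the per-line rewrite an interceptor could apply.
final class LineTagger {
    // Assumed key and value; replace with whatever sed inserts today.
    static final String KEY = "site=";
    static final String VALUE = "dc1";

    // Returns the line unchanged if it already contains the key,
    // otherwise returns the line prefixed with "site=dc1 ".
    static String tagIfMissing(String line) {
        if (line.contains(KEY)) {
            return line;
        }
        return KEY + VALUE + " " + line;
    }

    // Same operation on a raw event body (bytes in, bytes out),
    // which is the form a Flume event body takes.
    static byte[] tagBody(byte[] body) {
        String line = new String(body, StandardCharsets.UTF_8);
        return tagIfMissing(line).getBytes(StandardCharsets.UTF_8);
    }
}
```

Because the check runs before the event reaches the sink, each line is already complete when it lands in the file, and nothing has to touch the file while it is still open.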
On Fri, Dec 21, 2012 at 2:26 PM, Cochran, David <[EMAIL PROTECTED]> wrote:
> just had a thought... before I turn this script up and make a mess of things
> I figured I'd ask the group...
> I'm running Flume 1.3 using FILE_ROLL at the sink.... the 'live in
> use' files are being periodically scanned for key events while still 'live'
> and being appended to by Flume... no problems there as they are just being
> now the interesting part, I also need to do a little processing of the
> stored logs (using sed) to insert a couple pieces of data into each line (if
> it doesn't already exist) before my log scanner process does its thing.
> I'm not sure what the odds are of this NOT totally hosing the Flume
> process/data... maybe it recognizes the file is in use and waits? The
> files are processed by sed pretty quickly ( ~15 secs) as they are rotated
> Has anyone else tried this yet or have any insight as to how Flume might
> react before I attempt to make bit soup?
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/