Cochran, David 2012-12-21, 20:26
Re: post-processing
Brock Noland 2012-12-21, 20:46
I wouldn't modify the files while flume is also modifying them. It
might work but also might be a complete mess.

If you need to modify the events before they are written, interceptors
are the correct solution. After the file is done from a flume
perspective, modify all you wish!

On Fri, Dec 21, 2012 at 2:26 PM, Cochran, David <[EMAIL PROTECTED]> wrote:
> just had a thought... before I turn this script up and make a mess of things
> I figured I'd ask the group...
>
> I'm running FLUME 1.3 running using FILE_ROLL at the sink.... the 'live in
> use' files are being periodically scanned for key events while still "live'
> and being appending to by Flume... no problems there as they are just being
> read....
>
> now the interesting part, I also need to do a little processing of the
> stored logs (using sed) to insert a couple pieces of data into each line (if
> it doesn't already exist) before my log scanner process does it's thing.
>
> I'm not sure what the odds are of this NOT totally hosing the flume
> process/data will be...maybe recognizes the file is in use and waits? The
> files are processed by sed pretty quickly ( ~15 secs) as they are rotated
> daily.
>
> Has anyone else tried this yet or have any insight as to how Flume might
> react before I attempt to make bit soup?
>
> Thanks,
> -Dave

--
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
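
For anyone finding this thread later, here is a rough sketch of the "wait until the file is done" approach, in Python rather than sed. It is only an illustration under stated assumptions: the function names, the quiet-period threshold, and the "newest file is the live one" heuristic are all hypothetical and do not come from Flume. It skips the most recently modified file in the directory, skips anything touched within the quiet period, and rewrites the rest through a temporary file so a reader never sees a half-written log.

```python
import os
import tempfile
import time

# Assumption: a file untouched for this long has already been rolled.
QUIET_SECONDS = 120


def finished_files(directory, quiet_seconds=QUIET_SECONDS, now=None):
    """Return files that look finished: not the newest, not recently modified."""
    now = time.time() if now is None else now
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = sorted((p for p in paths if os.path.isfile(p)), key=os.path.getmtime)
    # The most recently modified file is presumed to be the one still being written.
    candidates = files[:-1]
    return [p for p in candidates if now - os.path.getmtime(p) >= quiet_seconds]


def tag_lines(path, tag):
    """Prefix each line with tag unless it already starts with it."""
    directory = os.path.dirname(path) or "."
    # Write to a temp file in the same directory, then swap it in atomically.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as out, open(path) as src:
        for line in src:
            out.write(line if line.startswith(tag) else tag + line)
    os.replace(tmp, path)
```

Usage would look something like `for p in finished_files("/path/to/rolled/logs"): tag_lines(p, "host=a ")`, where both the path and the tag are placeholders. Note that the temporary file is created with restrictive permissions, so the rewritten file may need its mode restored if other users read it.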