Hey Chris - what Steve said is right on:
"Unless you can always guarantee that you will always be able to
continue where you left off and never re-send data then it's probably
best to go right to the logging source and have that piece send
directly to flume (i.e., avro, log4j plugins, etc.)."
If you are using an asynchronous source, like tailing, there is always
a possibility of data loss. What if the disk that the log is stored on
fails before Flume gets to it? This failure window is inherent in
trying to collect logs like this - and that is what the warning in the
user guide is about.
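For context, the tail-based ExecSource setup that warning applies to is configured along these lines (the agent, source, and channel names here are made up for illustration):

```properties
# Hypothetical agent "a1" tailing an application log via ExecSource.
a1.sources = tail1
a1.channels = c1

a1.sources.tail1.type = exec
a1.sources.tail1.command = tail -F /var/log/app/app.log
a1.sources.tail1.channels = c1

# If c1 fills up, lines already read by tail -F are silently dropped --
# this is exactly the failure mode the user guide warns about.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
```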
Steve - I am working on a tool to read through rolled log files on
disk, send them to a Flume agent, and then rename or delete the
files... would be interested to hear whether you think this could
displace your current perl setup in terms of functionality.
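The rolled-file shipper I have in mind looks roughly like this. This is a simplified sketch: in the real tool the `send` callback would be a Flume RPC client rather than the stand-in used here, and the `.sent` rename suffix is just an illustration, not a settled design.

```python
import os

def ship_rolled_logs(log_dir, send, done_suffix=".sent"):
    """Read each completed (rolled) log file in log_dir, hand its lines
    to `send`, then rename the file so it is not re-read.

    `send` is a callable taking one log line; in the real tool it would
    deliver the line to a Flume agent (e.g. over Avro RPC).
    """
    shipped = []
    for name in sorted(os.listdir(log_dir)):
        if name.endswith(done_suffix):
            continue  # already shipped on a previous pass
        path = os.path.join(log_dir, name)
        with open(path) as f:
            for line in f:
                send(line.rstrip("\n"))
        # Rename only after every line was handed off, so a crash
        # mid-file re-sends the whole file (duplicates) rather than
        # losing it.
        os.rename(path, path + done_suffix)
        shipped.append(name)
    return shipped
```

Note the trade-off you raise below: because the rename happens after the send, a crash between the last send and the rename re-ships the file, so this design deliberately prefers duplicates over loss.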
On Thu, Aug 30, 2012 at 8:06 AM, Steve Johnson <[EMAIL PROTECTED]> wrote:
> Chris, I'm testing something similar from the sounds of it. We were
> originally going to go with the idea of using some sort of log tailer to
> pass events (log recs) into the flume agent. Right now, I'm testing using a
> simple perl script that reads a rotated log file and sends its records over
> the network to a flume agent using the NetCat source. This is not ideal, but
> it's good enough for some initial Flume testing; right now I'm just trying
> to stress-test the system.
> When you think about it, the nature of tailing logs is that you really can't
> guarantee delivery anyway. For instance, what happens if you need to take
> your server down, or the tailer fails and you need to restart it - where
> were you in tailing the log? In my case, it is as bad or worse for us to
> duplicate a logrec as it is to miss one. So tailing itself is a tricky
> thing. Unless you can always guarantee that you will always be able to
> continue where you left off and never re-send data then it's probably best
> to go right to the logging source and have that piece send directly to flume
> (i.e., avro, log4j plugins, etc.). However, the downfall there is that if the
> flume agent goes down, your app generating the logs should as well to ensure
> you don't process requests that you can't keep a record of, or at least
> write it smart enough to fall back to a file when that happens so that you
> can recover them in a batch process later.
> However, if you're using this for something like syslogging, error logs, or
> monitoring, it's probably not that critical if you duplicated or missed some
> logrecs for a short time after a recovery. I guess it really depends on the
> application. I'll be interested to hear your solution though for this, as
> I'm still in the process myself.
> On Thu, Aug 30, 2012 at 9:45 AM, Chris Neal <[EMAIL PROTECTED]> wrote:
>> Hi Patrick,
>> My issue with ExecSource is the giant warning in the user guide:
>> The problem with ExecSource and other asynchronous sources is that the
>> source can not guarantee that if there is a failure to put the event into
>> the Channel the client knows about it. In such cases, the data will be lost.
>> As a for instance, one of the most commonly requested features is the tail
>> -F [file]-like use case where an application writes to a log file on disk
>> and Flume tails the file, sending each line as an event. While this is
>> possible, there’s an obvious problem; what happens if the channel fills up
>> and Flume can’t send an event? Flume has no way of indicating to the
>> application writing the log file that it needs to retain the log or that the
>> event hasn’t been sent, for some reason. If this doesn’t make sense, you
>> need only know this: Your application can never guarantee data has been
>> received when using a unidirectional asynchronous interface such as
>> ExecSource! As an extension of this warning - and to be completely clear -
>> there is absolutely zero guarantee of event delivery when using this source.
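Coming back to Steve's point above about falling back to a file when the agent is down: that pattern could be sketched like this (names are hypothetical; `send_to_flume` stands in for whatever RPC client the application uses):

```python
def log_event(line, send_to_flume, spool_path):
    """Try to deliver a log line to Flume; on failure, append it to a
    local spool file so a batch job can recover it later.

    Returns True if delivered, False if spooled.
    """
    try:
        send_to_flume(line)
        return True
    except Exception:
        # Agent unreachable (or its channel is full): keep the record
        # locally instead of dropping it, per the fall-back approach
        # described above.
        with open(spool_path, "a") as spool:
            spool.write(line + "\n")
        return False
```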