Flume >> mail # user >> doubt in exec source specifically in tail -F

doubt in exec source specifically in tail -F

In Flume NG, is there any way to use an exec source (tail -F) so that
only the new lines being added to the log file are picked up?
(That is, there is a growing log file and we want to transfer all of
its logs using Flume, without duplicating any of them.)

I understand that if something fails we will get duplicates, because
tail doesn't maintain state.
But we are not considering failover for now.

So I think "tail -F" is only useful in scenarios where the sink or some
agent can remove duplicates. Is that correct?

But since tail seems to be quite a popular source in Flume, I thought I
might be missing something.
Presently, using "tail -F <file>" as the source to read from the log
file leads to scenarios like this:

1. If the file has not changed for a while, tail still checks it every
second and then prints the same lines again (depending on the -n option).
2. Even when the file grows, tail gives us no real control over which
lines we get.
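
To make concrete what "only the new lines" would require, here is a minimal Python sketch of reading a growing file from a remembered byte offset. It is not part of Flume and is not how the exec source works; the function name `read_new_lines` and the idea of keeping the offset in the caller are assumptions made purely for illustration.

```python
import os


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after `offset`.

    Hypothetical helper for illustration only: `offset` is a byte position
    that the caller remembers between calls.
    """
    size = os.path.getsize(path)
    if size < offset:
        # The file shrank (truncated or rotated), so start from the beginning.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline, so a half-written line is
    # left in place and picked up on a later call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Calling it repeatedly with the offset returned by the previous call yields each complete line once while the process keeps running. The offset lives only in memory here, so, as with tail, a restart would lose the position unless it were also written somewhere durable.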