

Re: Would someone please comment on Tail Source in NG?
Hi Chris,

A few months back I actually ported the original Flume tail source, but
it was decided (and I agree with the reasoning) not to include it for a
number of reasons, which can be seen on the original ticket at
https://issues.apache.org/jira/browse/FLUME-931 . One of the big ones is
the fact that Java cannot access inode information.

What we do is have a Python program that tracks the files in a directory
and then sends the data in the Scribe format to the ScribeSource (we
were using Scribe until switching to Flume, so we are just reusing our
ingest system from then). This gives us the freedom to customize the
ingest to our own expectations, and we write checkpoints of how far we
have tailed. You could write this in whatever language you're
comfortable with and pass the data via Avro or Thrift.
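
As a rough illustration of the checkpointing idea above (not our actual
program), here is a minimal Python sketch. It assumes a POSIX filesystem
where st_ino identifies a file across renames. The names tail_once,
load_checkpoint and save_checkpoint, the JSON checkpoint layout, and the
send callback are all hypothetical; in a real setup send would hand each
line to whatever Scribe, Avro or Thrift client you use.

```python
import json
import os


def load_checkpoint(path):
    """Return the saved {"inode": ..., "offset": ...} dict, or None if absent."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None


def save_checkpoint(path, inode, offset):
    """Write the checkpoint via a temp file so a crash cannot leave it half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"inode": inode, "offset": offset}, f)
    os.replace(tmp, path)


def tail_once(log_path, checkpoint_path, send):
    """Forward complete lines added since the last checkpoint; return how many."""
    sent = 0
    with open(log_path, "rb") as f:
        st = os.fstat(f.fileno())
        cp = load_checkpoint(checkpoint_path)
        # A different inode means the log was rotated; a smaller size means it
        # was truncated. In both cases start again from the beginning.
        same_file = cp is not None and cp["inode"] == st.st_ino
        offset = cp["offset"] if same_file and cp["offset"] <= st.st_size else 0
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partial line still being written
            send(line.rstrip(b"\n"))
            offset += len(line)
            sent += 1
    save_checkpoint(checkpoint_path, st.st_ino, offset)
    return sent
```

Calling tail_once in a loop with a short sleep gives a basic tailer. Note
that the checkpoint is only written after the lines are sent, so a crash
in between resends lines rather than losing them (at-least-once
delivery); whether that trade-off suits you depends on your pipeline.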

On 08/30/2012 01:18 AM, Chris Neal wrote:
> Hey guys,
>
> I'm sure this is not a new question, but I haven't found an answer in
> my searches.  I'm curious why there is as of yet no Tail Source with
> NG?  It seems one of the most common use cases for Flume is to tail a
> log file and dump it "somewhere".  Given that, it sure would seem that
> a Tail Source would be one of the first sources that gets written with
> a new version.
>
> I know about all the other ways to implement something *like* a Tail
> Source: Exec Source, Avro, Log4jAppender... and unfortunately they
> all have limitations with regard to either functionality or
> reliability/recoverability.
>
> What am I missing here?
>
> Is there any work being done on a Tail Source for NG?
>
> I promise I'm not complaining, just trying to understand the logic. :)
>
> Much appreciated.
> Chris