Flume, mail # user - Avro client in NG as replacement for tail source in OG?


Re: Avro client in NG as replacement for tail source in OG?
Chris Neal 2012-08-14, 15:00
I could get around the tail -F problem of Solaris with gtail, but that
doesn't address the reliability problem.
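
For reference, a minimal sketch of the exec-source setup suggested in this
thread, with gtail substituted for the Solaris tail. The agent name, component
names, log path, host, and port below are placeholders and not taken from the
thread, and the exec source's reliability caveat still applies:

```properties
# Hypothetical agent on the application host: exec source -> file channel -> avro sink.
# All names, paths, hosts, and ports are illustrative assumptions.
agent1.sources = tailsrc
agent1.channels = fc
agent1.sinks = tier1sink

# Exec source running a follow-mode tail (GNU tail, installed as gtail on Solaris).
agent1.sources.tailsrc.type = exec
agent1.sources.tailsrc.command = gtail -F /var/log/app/app.log
agent1.sources.tailsrc.channels = fc

# File channel, as used elsewhere in the described topology.
agent1.channels.fc.type = file

# Avro sink forwarding to the AvroSource of the Tier 1 agent.
agent1.sinks.tier1sink.type = avro
agent1.sinks.tier1sink.hostname = tier1.example.com
agent1.sinks.tier1sink.port = 4141
agent1.sinks.tier1sink.channel = fc
```

If the agent restarts, the tail process starts over, so events may be
re-sent or missed; that part is not solved by this configuration.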

I also found this:
http://logging.apache.org/log4j/2.x/manual/appenders.html#FlumeAvroAppender
which has potential, but it's in an alpha release of log4J, which won't fly
in my production environment.

Can anyone comment on the priority/timeframe of a tail source in NG?  :)

Thanks!

On Tue, Aug 14, 2012 at 9:19 AM, Chris Neal <[EMAIL PROTECTED]> wrote:

> Thanks Mark and Patrick,
>
> I had seen the ExecSource, and its associated caveats :)  That has two
> problems for me:
>
> 1)  Its aforementioned unreliability should the channel have errors, and
> 2)  All my applications run on Solaris, whose tail command does not have an
> equivalent of the -F option from the Linux world. :(
>
> I noticed that the wiki here <https://cwiki.apache.org/confluence/display/FLUME/Features+and+Use+Cases> does
> mention a feature for NG for supporting a tail source, but no JIRA is
> assigned to it yet.  If that page is still up to date, I'm glad to see that
> it's at least on the roadmap!
>
> If anyone can think of any other creative ways to do this, please share.
>
> Much appreciated!
> Chris
>
>
> On Mon, Aug 13, 2012 at 6:28 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote:
>
>> Hey Chris,
>>
>> Mark has got it - for the behavior you are looking for, you probably want
>> a tier of flume agents running where you currently have avro clients,
>> using an exec source with "tail -F" as the command.
>>
>> Keep in mind the red warning box in the user guide related to
>> sources like this. If a flume agent restarts, it may re-send or miss
>> certain log entries.
>>
>> - Patrick
>>
>> On Mon, Aug 13, 2012 at 1:03 PM, Stern, Mark <[EMAIL PROTECTED]> wrote:
>> > Use an exec source running 'tail -F'.
>> > ________________________________________
>> > From: Chris Neal [[EMAIL PROTECTED]]
>> > Sent: Monday, August 13, 2012 8:43 PM
>> > To: [EMAIL PROTECTED]
>> > Subject: Avro client in NG as replacement for tail source in OG?
>> >
>> > Hi all.
>> >
>> > I have a very typical configuration:
>> >
>> > Application logs to log4J file.
>> > FlumeNG avro-client watches the file and sends events into FlumeNG Agent Tier 1
>> > FlumeNG Agent Tier 1:  AvroSource to FileChannel to AvroSink
>> > FlumeNG Agent Tier 2: AvroSource to FileChannel to HDFSSink
>> >
>> > I was noticing that after some time, the Tier 1 Agent would disconnect
>> > from the avro-client.  What is happening is that the avro-client sends
>> > events to the Tier 1 Agent as fast as it can, and when it reaches the end
>> > of the file, it exits.  The problem is, the application is still logging to
>> > the log4J file, but now all future events are lost because the avro-client
>> > has exited.
>> >
>> > I thought the "-F" option to avro-client was like the "-F" option to
>> > tail, but after looking at the code, it is not.  There seems to be no
>> > "follow" mode for the avro client that I can see.  I then stumbled across
>> > this key sentence from here
>> > <https://cwiki.apache.org/confluence/display/FLUME/Getting+Started#GettingStarted-flumengavroclientoptions>:
>> >
>> >  Think of the avro-client command as cat for Flume
>> >
>> > So, it's a "cat", not a "tail".
>> >
>> > So I'm wondering, what's the right/best/current way to emulate OG's
>> > tailSource?
>> >
>> > Much appreciated.
>> > Chris
>>
>
>