Flume >> mail # user >> Flume to stream logs live

Kartashov, Andy 2012-12-14, 14:47
Mohammad Tariq 2012-12-14, 14:51
Brock Noland 2012-12-14, 14:52
RE: Flume to stream logs live
I have set up a Windows Flume flow using LogParser and the AvroClient app bundled with Flume.

It's a PowerShell script, scheduled every 5 minutes, which runs a checkpointed query via LogParser to create incremental files for IIS logs and a couple of our other app logs. The incremental files are then sent to a Flume node running AvroSource. From there it's a typical Flume setup: the log types are split based on a header that I append when sending via the AvroClient, and then sent to collector nodes that sink to HDFS.
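For readers who want to see what a checkpointed incremental read looks like, here is a minimal Python sketch of the idea. It is not the LogParser query or the PowerShell script described above; the function name, the JSON checkpoint file, and the byte-offset approach are all assumptions made for illustration.

```python
import json
import os


def read_increment(log_path, checkpoint_path):
    """Return the complete lines appended to log_path since the last call.

    Hypothetical helper: the byte offset of the last consumed line is kept
    in a small JSON checkpoint file between runs.
    """
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f).get("offset", 0)

    # If the log was rotated or truncated, start again from the beginning.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Only consume complete lines; a partial last line waits for the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()

    with open(checkpoint_path, "w") as f:
        json.dump({"offset": offset + end}, f)
    return lines
```

Each scheduled run would call `read_increment` once per log file and hand the returned lines to whatever client ships them to Flume.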

It's currently a best effort architecture as I don't trap any errors from the AvroClient on the Windows side. I did extend the AvroClient to kick out exit codes though, just not using it yet (see https://issues.apache.org/jira/browse/FLUME-1670). I've been sending about 15GB of IIS logs per day per server without issues, though.
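As a rough illustration of what trapping those exit codes could look like, the Python sketch below reruns a send command until it exits with 0. The command line is left to the caller, and the function name and retry count are hypothetical; nothing here reflects how the AvroClient itself is invoked.

```python
import subprocess


def send_with_retry(cmd, attempts=3):
    """Run cmd until it exits with code 0.

    Returns the number of attempts used, or None if every attempt failed.
    cmd is an argument list, as accepted by subprocess.run.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
    return None
```

A caller could keep the incremental file around whenever `None` comes back, so that a later run can resend it instead of losing it.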

It's not the best solution, but it works for now. Longer term we are thinking of a custom app on our side that leverages the HTTPSource or, if we get ambitious, implementing the Avro RPC in .NET, but that's a backburner project right now.
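A custom sender for the HTTPSource could be quite small. The sketch below is in Python rather than .NET and assumes the source uses Flume's default JSON handler, which accepts a JSON array of events, each with string-valued "headers" and a "body". The header name `logtype`, the function names, and the URL passed in by the caller are placeholders, not anything from our setup.

```python
import json
import time
import urllib.request


def build_payload(lines, log_type, now_ms=None):
    """Encode log lines as a JSON array of Flume events (headers + body)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    events = [
        {
            # Header values are sent as strings; "logtype" is a made-up name
            # that a channel selector could route on.
            "headers": {"logtype": log_type, "timestamp": str(now_ms)},
            "body": line,
        }
        for line in lines
    ]
    return json.dumps(events).encode("utf-8")


def post_events(url, payload, timeout=10):
    """POST a payload to an HTTP source and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Because the sender sees the HTTP status of each batch, it can decide whether to retry, which is the error handling the AvroClient flow is currently missing.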

Also, I'm bucketing the events based on a timestamp interceptor, which has caused post-processing pain because the event timestamps are off by ~5 minutes from the header. I'm looking forward to using the regex capture interceptor to stamp the events with the event time soon.
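To show the idea of taking the timestamp from the event itself, here is a small Python sketch. It assumes W3C-style IIS lines that begin with `date time` fields in UTC; the pattern and function name are illustrative only and are not the Flume interceptor's configuration.

```python
import re
from datetime import datetime, timezone

# Assumed layout: "YYYY-MM-DD HH:MM:SS" at the start of each data line.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2})\b")


def event_timestamp_ms(line):
    """Return the line's own timestamp as epoch milliseconds, or None."""
    m = TS_RE.match(line)
    if m is None:
        return None  # e.g. "#Fields:" comment lines carry no timestamp
    dt = datetime.strptime(m.group(1) + " " + m.group(2), "%Y-%m-%d %H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)
```

Writing this value into the `timestamp` header, instead of the time the event was received, would let the HDFS buckets line up with when the request actually happened.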

Paul Chavez

-----Original Message-----
From: Brock Noland [mailto:[EMAIL PROTECTED]]
Sent: Friday, December 14, 2012 6:52 AM
Subject: Re: Flume to stream logs live


FWIW, if I were sending log data from Windows, I would write a little Windows log agent and send the data to the HTTP Source.


On Fri, Dec 14, 2012 at 8:47 AM, Kartashov, Andy <[EMAIL PROTECTED]> wrote:
> Flummers,
> Loved working with Flume 1.2 - very easy and simple configuration, it
> was a pleasure to work with. Managed to "tail -F" logs from unix
> server and into a hdfs cluster. The problem started when I also needed
> to push logs from a Windows application server.  Spent three days
> researching how to install Flume on Windows and run a daemon/agent
> that will push the logs to the Avro source I successfully configured
> and ran on Unix. No luck. So I am looking at alternatives. Is there
> another framework out there that could help me with this issue? What about Scribe?
> Andy Kartashov
> IT Architecture, Co-op
> 1340 Pickering Parkway, Pickering, L1V 0C4
> Phone: (905) 837 6269
> Mobile: (416) 722 1787

Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
Bertrand Dechoux 2012-12-16, 12:25
Juhani Connolly 2012-12-17, 01:33
Alexander Lorenz 2012-12-17, 07:57
Kartashov, Andy 2012-12-19, 16:20
Alexander Alten-Lorenz 2012-12-20, 07:02
Kartashov, Andy 2012-12-17, 15:50
Kartashov, Andy 2012-12-20, 17:44