Flume >> mail # user >> Event breaking in flume


Chhaya Vishwakarma 2013-12-30, 09:53
Ashish 2013-12-30, 10:17
Chhaya Vishwakarma 2013-12-30, 10:26
Ashish 2013-12-30, 10:34
Joao Salcedo 2013-12-30, 10:51
Chhaya Vishwakarma 2013-12-30, 10:56
Joao Salcedo 2013-12-30, 11:05
Brock Noland 2013-12-30, 14:17
Chhaya Vishwakarma 2013-12-31, 03:54
Ashish 2013-12-31, 04:22
Chhaya Vishwakarma 2013-12-31, 06:49
Brock Noland 2013-12-31, 14:54
RE: Event breaking in flume
Hi

Does a Flume source send one line of a file as one event by default? What exactly are interceptors used for? If I use the morphline interceptor, will it send multiple lines?
Regards,
Chhaya Vishwakarma

From: Brock Noland [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, December 31, 2013 8:25 PM
To: [EMAIL PROTECTED]
Subject: Re: Event breaking in flume

You'd need to do that in Java. If you want to use Python, I would use the second solution I posted earlier.

"Another solution is:

1) replace new lines with something like __NL__ by a perl script in your exec source
2) Use morphlines to replace __NL__ with \n"
On Tue, Dec 31, 2013 at 12:49 AM, Chhaya Vishwakarma <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
How about using Python?

From: Ashish [mailto:[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>]
Sent: Tuesday, December 31, 2013 9:53 AM

To: [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>
Subject: Re: Event breaking in flume

Have a look at org.apache.flume.serialization.LineDeserializer in the flume-ng-core module.

On Tue, Dec 31, 2013 at 9:24 AM, Chhaya Vishwakarma <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
Hi Brock

Thanks. Using the spooling directory source with a deserializer looks good; however, I don't have any idea how to write a custom deserializer.
Can you give me a little hint on how I should go about writing my own deserializer? It would be a great help.
Regards,
Chhaya Vishwakarma

From: Brock Noland [mailto:[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>]
Sent: Monday, December 30, 2013 7:48 PM
To: [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>
Subject: Re: Event breaking in flume

Yes, it is possible to handle multi-line events, and handling stack traces is very commonplace.

However, using exec source is going to be limiting. The "correct" solution is:

1) Use spooling directory source
2) Write a little deserializer to handle your format.
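
The framing rule such a deserializer needs can be sketched independently of Flume. The Python below is only an illustration of the logic, not Flume code: a real deserializer would be a Java class built on Flume's serialization interfaces. It assumes that every event starts with a timestamp shaped like "10 Sep 2013 19:43:33,561" (as in the log sample in this thread) and that any other line continues the previous event. The names EVENT_START and group_events are hypothetical.

```python
import re

# Assumption: a new event begins with a timestamp such as
# "10 Sep 2013 19:43:33,561"; every other line is a continuation.
EVENT_START = re.compile(r"^\d{1,2} [A-Z][a-z]{2} \d{4} \d{2}:\d{2}:\d{2},\d{3} ")


def group_events(lines):
    """Yield one string per logical event, joining continuation lines."""
    current = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # blank separator lines carry no information
        if EVENT_START.match(line) and current:
            # A new timestamp closes the event collected so far.
            yield "\n".join(current)
            current = []
        # Lines seen before any timestamp end up in the first event.
        current.append(line)
    if current:
        yield "\n".join(current)
```

Because an event is only known to be complete once the next timestamp (or end of file) is seen, this rule fits a source that reads finished files, such as the spooling directory source, better than a live tail.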

Another solution is:

1) replace new lines with something like __NL__ by a perl script in your exec source
2) Use morphlines to replace __NL__ with \n
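
Step 1 does not have to be Perl. A minimal Python sketch of the same filter follows, under the assumption that a new event always starts with a timestamp like "10 Sep 2013 19:43:33,561" and that the marker string never occurs in the log itself. The names encode, decode, MARKER and EVENT_START are hypothetical; decode only shows the reverse substitution that step 2 would perform inside morphlines.

```python
import re

MARKER = "__NL__"
# Assumption: a new event begins with a timestamp such as
# "10 Sep 2013 19:43:33,561".
EVENT_START = re.compile(r"^\d{1,2} [A-Z][a-z]{2} \d{4} \d{2}:\d{2}:\d{2},\d{3} ")


def encode(lines):
    """Yield output chunks so that each event becomes one physical line."""
    first = True
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # drop blank separator lines
        if first:
            yield line
            first = False
        elif EVENT_START.match(line):
            # A real newline is emitted only between events.
            yield "\n" + line
        else:
            # A continuation line is glued on with the marker.
            yield MARKER + line


def decode(body):
    """Restore the original line breaks inside one event body."""
    return body.replace(MARKER, "\n")
```

To use it as a filter, write each chunk from encode(sys.stdin) to sys.stdout and flush. Note that the newline ending an event is written only when the next event starts, so the most recent event reaches Flume with a delay.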

A third and less desirable solution would be:

1) Use the morphlines interceptor to merge multiple events into a single event. This will not work well for a variety of reasons, the most common being that the exec source could hit its "batch" size in the middle of a stack trace, in which case the stack trace will be in two different batches.
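
The batch problem can be shown with a toy simulation. This is plain Python, not Flume: it assumes that merging only ever sees the lines of one batch at a time, and the batch size of 3 and the stack frames are made up for illustration.

```python
def batches(lines, batch_size):
    """Split lines into consecutive fixed-size batches."""
    for i in range(0, len(lines), batch_size):
        yield lines[i:i + batch_size]


# One logical event: a five-line stack trace (hypothetical frames).
trace = [
    "10 Sep 2013 19:43:33,561 [WebContainer : 9] ERROR - handleException()",
    "     at com.example.A.a(A.java)",
    "     at com.example.B.b(B.java)",
    "Caused by: com.example.CException: boom",
    "     at com.example.D.d(D.java)",
]

# Merging within each batch cannot see across the batch boundary,
# so the single stack trace comes out as two separate events.
merged = ["\n".join(batch) for batch in batches(trace, batch_size=3)]
print(len(merged))  # prints 2
```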

Brock
On Mon, Dec 30, 2013 at 5:05 AM, Joao Salcedo <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
It looks like it is possible using regular expression pattern matching:

http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#/readMultiLine

On Mon, Dec 30, 2013 at 9:56 PM, Chhaya Vishwakarma <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
So is it not possible to handle multiline events in flume?

From: Joao Salcedo [mailto:[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>]
Sent: Monday, December 30, 2013 4:22 PM

To: [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>
Subject: Re: Event breaking in flume

Maybe you can set up some morphlines and do some ETL on your events.

I hope this helps.

http://blog.cloudera.com/blog/2013/07/morphlines-the-easy-way-to-build-and-integrate-etl-apps-for-apache-hadoop/

Cheers

On Mon, Dec 30, 2013 at 9:34 PM, Ashish <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
I am not aware of any options out of the box. Maybe someone else can help.
An alternative is to write a custom source.

On Mon, Dec 30, 2013 at 3:56 PM, Chhaya Vishwakarma <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
Hi
I am using exec as the source, with the tail command.
From: Ashish [mailto:[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>]
Sent: Monday, December 30, 2013 3:48 PM
To: [EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>
Subject: Re: Event breaking in flume

What is the Source you are using?

On Mon, Dec 30, 2013 at 3:23 PM, Chhaya Vishwakarma <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
Hi,

By default Flume considers one line as one event, but I want to break events on some other criterion. How can this be achieved in Flume? Is it possible?

10 Sep 2013 19:43:33,561 [WebContainer : 9] ERROR - An Error has occured for com.marsh.framework.core.exception.MarshException: Record has been modified since last retrieved - Resubmit transaction

10 Sep 2013 19:43:33,561 [WebContainer : 9] ERROR - handleException():com.marsh.framework.core.exception.MarshException: Record has been modified since last retrieved - Resubmit transaction
     at com.marsh.csa.serviceagreement.ServiceAgreementImpl.updateAgreement(ServiceAgreementImpl.java(Compiled Code))
     at com.marsh.csa.serviceagreementmgmt.CSAManagerImpl.updateCSA(CSAManagerImpl.java(Compiled Code))
     at com.marsh.csa.serviceagreementmgmt.ejb.EJSRemoteStatelessServiceagreementManager_3dcfd156.updateCSA(Unknown Source)
     at com.marsh.csa.serviceagreementmgmt.ejb._ServiceagreementManagerRemote_Stub.updateCSA(_ServiceagreementManagerRemote_Stub.java(Compiled Code))
     at com.marsh.csa.proxy.CSAProxy.updateCSA(CSAProxy.java(Compiled Code))
     at com.marsh.csa.serviceagreement.SaveCSAAction.performAction(SaveCSAAction.java(Compiled Code))
     at com.marsh.csa.serviceagreement.CSAAbstractStrutsAction.execute(CSAAbstractStrutsAction.java(Compiled Code))
     at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java(Inlined Compiled Code))
     at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java(Compiled Code))
Caused by: com.marsh.framework.core.exception.MarshException: Record has been modified since last retrieved - Resubmit transaction
     at com.marsh.csa.serviceagreement.ServiceAgreementDAO.updateServiceAgreement(ServiceAgreementDAO.java(Compiled Code))
     at com.marsh.csa.serviceagreement.Servi
Christopher Shannon 2013-12-30, 16:48