Flume >> mail # user >> Dynamic Key=Value Parsing with an Interceptor?


Matt Wise 2013-11-10, 01:28
Matt Wise 2013-11-11, 18:03
Paul Chavez 2013-11-11, 23:09
Re: Dynamic Key=Value Parsing with an Interceptor?
Paul,
  Thanks for the feedback. I looked briefly at Morphline, but wasn't sure
if it was what I needed. I will take a deeper dive this week and see if it
will do what we want. Ultimately, the reason we're not changing the apps is
that we honestly don't always have a lot of control. Many of the apps are
third-party apps where we can only barely adjust their
log-line formats.

Matt Wise
Sr. Systems Architect
Nextdoor.com
On Mon, Nov 11, 2013 at 3:09 PM, Paul Chavez <[EMAIL PROTECTED]> wrote:

> I think there may be two ‘out of box’ ways to do this kind of thing. First
> would be using the regex extract interceptor with multiple serializers
> keying on various fields. However, that’s not really dynamic and is only
> a half-step better than one interceptor for each field, as you mentioned.
> Second would be to use the morphline interceptor to parse your event body
> and insert headers as needed. I have to admit I have no experience with
> this interceptor, but from reading the documentation it seems designed for
> this kind of use case.
>
>
>
> Ultimately though, when faced with this we opted to push this into the app
> layer. Is there a reason the applications can’t write these key/value pairs
> as headers in the first place? We use an HTTP source and when we wrote the
> logging class for it on our app side we put similar functionality in as
> category/subcategory headers. Then Flume doesn’t have to have any special
> interceptors beyond a default static one in case the headers are completely
> missing, and we write to HDFS with tokenized paths so each permutation of
> those headers gets a separate directory.
>
>
>
> If you continue to explore this issue, please keep us updated. I
> would especially like to hear some real-world morphline examples.
>
>
>
> Hope that helps,
>
> Paul Chavez
>
>
>
>
>
> *From:* Matt Wise [mailto:[EMAIL PROTECTED]]
> *Sent:* Monday, November 11, 2013 10:04 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: Dynamic Key=Value Parsing with an Interceptor?
>
>
>
> Anyone have any ideas on the best way to do this?
>
>
> Matt Wise
>
> Sr. Systems Architect
>
> Nextdoor.com
>
>
>
> On Sat, Nov 9, 2013 at 5:28 PM, Matt Wise <[EMAIL PROTECTED]> wrote:
>
> Hey, we'd like to set up a default format for all of our logging systems,
> perhaps looking like this:
>
>
>
>   "key1=value1;key2=value2;key3=value3...."
>
>
>
> With this pattern, we'd allow developers to define any key/value pairs
> they want to log, and separate them with a common separator.
>
>
>
> If we did this, what do we need to do in Flume to get Flume to parse out
> the key=value pairs into dynamic headers? We pass our data from Flume into
> both HDFS and ElasticSearch sinks. We would really like to have these
> fields dynamically sent to the sinks for much easier parsing and analysis
> later.
>
>
>
> Any thoughts on this? I know that we can define a unique interceptor for
> each service that looks for explicit field names ... but that's a nightmare
> to manage. I really want something truly dynamic.
>
>
> Matt Wise
>
> Sr. Systems Architect
>
> Nextdoor.com
>
>
>
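For illustration only, and not taken from any message in this thread: a minimal Python sketch of the two ideas discussed above, namely splitting a "key1=value1;key2=value2" body into a dictionary of headers, and building a per-header directory path with a default when a header is missing. The function names, the "{category}" template syntax, and the "unknown" default are hypothetical; this is not Flume's API or configuration syntax, and a real interceptor would be written against Flume's Java interfaces. It assumes values never contain the ";" separator.

```python
def parse_key_values(body, pair_sep=";", kv_sep="="):
    """Split 'k1=v1;k2=v2' into a dict of headers.

    Segments with no kv_sep, or with an empty key, are skipped.
    Only the first kv_sep in a segment splits key from value.
    """
    headers = {}
    for segment in body.split(pair_sep):
        key, found, value = segment.partition(kv_sep)
        key = key.strip()
        if found and key:
            headers[key] = value.strip()
    return headers


def tokenized_path(template, headers, defaults):
    """Fill a path template from headers, using defaults for missing keys.

    The '{name}' placeholder style is Python's str.format, chosen for this
    sketch only.
    """
    merged = {**defaults, **headers}
    return template.format(**merged)


if __name__ == "__main__":
    headers = parse_key_values("key1=value1;key2=value2;key3=value3")
    print(headers)

    # Hypothetical header names, mirroring the category/subcategory idea.
    defaults = {"category": "unknown", "subcategory": "unknown"}
    print(tokenized_path("/logs/{category}/{subcategory}",
                         {"category": "web"}, defaults))
```

Because malformed segments are skipped rather than raising an error, a single bad log line yields fewer headers instead of a failed event; whether that trade-off is appropriate depends on how strictly the downstream sinks need every field.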
Wolfgang Hoschek 2013-11-13, 00:44