Re: Review Request: FLUME-1657 - Regex Extractor Interceptor


> On Nov. 6, 2012, 5:05 p.m., Brock Noland wrote:
> > flume-ng-doc/sphinx/FlumeUserGuide.rst, line 1824
> > <https://reviews.apache.org/r/7700/diff/1/?file=178736#file178736line1824>
> >
> >     Generally in Flume, when we are going to have multiple sub-property items, we give them a name and then have sub-properties under that name. Thoughts on doing that here as opposed to using the class name?
>
> Cameron Gandevia wrote:
>     The idea was to allow anyone to provide their own serializer implementation beyond the ones provided by the Flume project. Maybe I am not following how this would be done using sub-property names.
>
> Brock Noland wrote:
>     Yes, I think that is a great idea!
>    
>     So what I was thinking is that this:
>    
>     agent.sources.r1.channels = c1
>     agent.sources.r1.type = SEQ
>     agent.sources.r1.interceptors = i1
>     agent.sources.r1.interceptors.i1.type = REGEX_EXTRACTOR
>     agent.sources.r1.interceptors.i1.regex = (WARNING)|(ERROR)|(FATAL)
>     agent.sources.r1.interceptors.i1.serializer = warning:com.blah.SomeSerializer,error,fatal:org.apache.flume.interceptor.RegexExtractorInterceptorTimestampSerializer
>     agent.sources.r1.interceptors.i1.org.apache.flume.interceptor.RegexExtractorInterceptorTimestampSerializer.dateFormat = yyyy-MM-dd
>    
>     becomes something approximately like this:
>    
>     agent.sources.r1.channels = c1
>     agent.sources.r1.type = SEQ
>     agent.sources.r1.interceptors = i1
>     agent.sources.r1.interceptors.i1.type = REGEX_EXTRACTOR
>     agent.sources.r1.interceptors.i1.regex = (WARNING)|(ERROR)|(FATAL)
>     agent.sources.r1.interceptors.i1.serializers = warning:s1,error,fatal:s1
>     agent.sources.r1.interceptors.i1.serializers.s1.type = org.apache.flume.interceptor.RegexExtractorInterceptorTimestampSerializer
>     agent.sources.r1.interceptors.i1.serializers.s1.dateFormat = yyyy-MM-dd
>     agent.sources.r1.interceptors.i1.serializers.s2.type = com.blah.SomeSerializer
>    
>     That is how other pluggable components are configured. Then s1.* can be passed to a configure method in the serializer, and the plugin can do its own configuration.
>
> Hari Shreedharan wrote:
>     I like Brock's idea. To be more uniform with other Flume components, I'd also change this to something like:
>     agent.sources.r1.interceptors.i1.serializers = s1 s2
>     agent.sources.r1.interceptors.i1.serializers.s1.type = org.apache.flume.interceptor.RegexExtractorInterceptorTimestampSerializer
>     agent.sources.r1.interceptors.i1.serializers.s1.dateFormat = yyyy-MM-dd
>     agent.sources.r1.interceptors.i1.serializers.s1.severity = warning
>     agent.sources.r1.interceptors.i1.serializers.s2.type = com.blah.SomeSerializer
>     agent.sources.r1.interceptors.i1.serializers.s2.severity = error fatal

+1
- Brock
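
For readers following the thread, here is a minimal sketch of how the named `serializers.<name>.*` layout discussed above could be split into one property map per serializer, which could then be handed to that serializer's configure method. It uses plain `java.util` maps rather than Flume's own configuration classes, and the class and method names (`SerializerConfigSketch`, `subProperties`, `serializerConfigs`) are hypothetical and not taken from the patch under review.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SerializerConfigSketch {

    // Returns the entries whose keys start with the given prefix,
    // with the prefix stripped from each key.
    static Map<String, String> subProperties(Map<String, String> props, String prefix) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                result.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return result;
    }

    // Reads the space-separated "serializers" list and collects the
    // "serializers.<name>.*" properties for each listed name.
    static Map<String, Map<String, String>> serializerConfigs(Map<String, String> interceptorProps) {
        Map<String, Map<String, String>> configs = new LinkedHashMap<>();
        String names = interceptorProps.get("serializers");
        if (names == null || names.trim().isEmpty()) {
            return configs;
        }
        for (String name : names.trim().split("\\s+")) {
            configs.put(name, subProperties(interceptorProps, "serializers." + name + "."));
        }
        return configs;
    }

    public static void main(String[] args) {
        // Properties as they might look once the
        // "agent.sources.r1.interceptors.i1." prefix has been stripped (an assumption).
        Map<String, String> props = new LinkedHashMap<>();
        props.put("type", "REGEX_EXTRACTOR");
        props.put("serializers", "s1 s2");
        props.put("serializers.s1.type",
                "org.apache.flume.interceptor.RegexExtractorInterceptorTimestampSerializer");
        props.put("serializers.s1.dateFormat", "yyyy-MM-dd");
        props.put("serializers.s2.type", "com.blah.SomeSerializer");

        for (Map.Entry<String, Map<String, String>> e : serializerConfigs(props).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Each inner map holds only the keys that belong to one serializer (for example `type` and `dateFormat` for `s1`), so a third-party serializer can read its own settings without the interceptor knowing about them.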
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7700/#review13156
-----------------------------------------------------------
On Nov. 14, 2012, 1:24 a.m., Cameron Gandevia wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7700/
> -----------------------------------------------------------
>
> (Updated Nov. 14, 2012, 1:24 a.m.)
>
>
> Review request for Flume.
>
>
> Description
> -------
>
> A RegexExtractor interceptor that will allow users to extract regex matches and append them as header fields of the event.
>
>
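
To illustrate the description above, the following is a rough sketch of extracting regex capture groups and storing them as header fields, using only `java.util.regex`. It is not the code from the patch: `RegexHeaderSketch` and `extractHeaders` are hypothetical names, headers are modelled as a plain map, and pairing group N with the N-th header name is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexHeaderSketch {

    // For the first match in the body, copies each capture group that
    // took part in the match into the headers map under the name at
    // the same position in headerNames.
    static Map<String, String> extractHeaders(Pattern regex, List<String> headerNames, String body) {
        Map<String, String> headers = new LinkedHashMap<>();
        Matcher m = regex.matcher(body);
        if (m.find()) {
            int groups = Math.min(m.groupCount(), headerNames.size());
            for (int i = 1; i <= groups; i++) {
                String value = m.group(i); // null if this group did not participate
                if (value != null) {
                    headers.put(headerNames.get(i - 1), value);
                }
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        Pattern regex = Pattern.compile("(WARNING)|(ERROR)|(FATAL)");
        List<String> names = Arrays.asList("warning", "error", "fatal");
        Map<String, String> headers =
                extractHeaders(regex, names, "2012-11-14 ERROR disk full");
        System.out.println(headers); // prints {error=ERROR}
    }
}
```

In the interceptor itself, a serializer would presumably transform the matched text before it is stored; this sketch stores the raw match.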
> Diffs
> -----
>
>   flume-ng-core/src/main/java/org/apache/flume/interceptor/InterceptorType.java c478337
>   flume-ng-core/src/main/java/org/apache/flume/interceptor/RegexExtractorInterceptor.java PRE-CREATION
>   flume-ng-core/src/main/java/org/apache/flume/interceptor/RegexExtractorInterceptorMillisSerializer.java PRE-CREATION
>   flume-ng-core/src/main/java/org/apache/flume/interceptor/RegexExtractorInterceptorPassThroughSerializer.java PRE-CREATION