Flume, mail # dev - Flume Jambalaya - A Flume Plugin with Multiple Components


Flume Jambalaya - A Flume Plugin with Multiple Components
Israel Ekpo 2014-05-02, 21:18
Flume Community,

I created a Flume Plugin with multiple components that complements the
current version of Apache Flume.

This was necessary for a personal project I am working on.

It is code-named Flume Jambalaya.

Jambalaya is a standalone Apache Flume plugin that contains a variety of
sources, interceptors, channels, sinks, serializers and other components
designed to extend the Flume architecture. It has been released under the
Apache License, Version 2.0.

https://github.com/aicer/flume-jambalaya

It currently contains:

(a) File Source - This source lets you ingest data by tailing files from a
specific path.
(b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch
cluster via HTTP, so the ElasticSearch version on the Flume side does not
have to match the version running on the server cluster.
(c) DateInterceptor - The date interceptor is used for parsing dates from
fields and using that date or timestamp as the timestamp for the Flume
event.
(d) Grok Interceptor - This interceptor lets you extract structured data
from unstructured text and inject it as headers into the event.
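
To illustrate what the two interceptors are meant to do, here is a rough
sketch in Python. It is not the plugin's code: the log line format, the
field names and both function names are made up for the example, and the
grok pattern is approximated with a plain regular expression using named
groups. It assumes the event timestamp is stored as epoch milliseconds in a
"timestamp" header.

```python
import re
from datetime import datetime, timezone

# Hypothetical line format: "<client> [<date>] <verb> <path>".
# A real grok pattern would be written with %{NAME:field} tokens; named
# groups stand in for that here.
LINE_PATTERN = re.compile(
    r"(?P<client>\S+) \[(?P<when>[^\]]+)\] (?P<verb>[A-Z]+) (?P<path>\S+)"
)


def extract_headers(body):
    """Grok-style step: turn an unstructured line into a dict of headers."""
    match = LINE_PATTERN.match(body)
    return match.groupdict() if match else {}


def apply_date(headers, field="when", fmt="%d/%b/%Y:%H:%M:%S %z"):
    """Date step: parse one field and use it as the event timestamp."""
    out = dict(headers)
    raw = out.get(field)
    if raw is None:
        return out  # nothing to parse, leave the headers untouched
    parsed = datetime.strptime(raw, fmt)
    if parsed.tzinfo is None:
        # Assumption for this sketch: dates without an offset are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    out["timestamp"] = str(int(parsed.timestamp() * 1000))
    return out


event_body = "10.0.0.1 [02/May/2014:21:18:00 +0000] GET /index.html"
headers = apply_date(extract_headers(event_body))
print(headers)
```

A line that does not match the pattern simply yields no headers, which is
one possible way to handle it; dropping or tagging the event would be
another.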

Sample configuration files are available here:

https://github.com/aicer/flume-jambalaya/tree/master/sample-configuration-files

I did not realize that the Flume trunk already has an HTTP Sink for
ElasticSearch, so you can decide whether or not to use the sink that comes
with this plugin.
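
For anyone wondering what talking to ElasticSearch over HTTP involves, here
is a small Python sketch that builds a request body for the ElasticSearch
bulk endpoint (newline-delimited JSON, one action line followed by one
document line per event). Whether the sink in Jambalaya uses the bulk
endpoint is an assumption on my part for this example, and the function
name, event layout and index name are hypothetical.

```python
import json


def build_bulk_body(events, index_name):
    """Build a newline-delimited JSON body for POSTing to /_bulk."""
    lines = []
    for event in events:
        # Action line: index the next document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Document line: event headers plus the body as one flat document.
        doc = dict(event["headers"])
        doc["body"] = event["body"]
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


events = [
    {"headers": {"host": "web01"}, "body": "first line"},
    {"headers": {"host": "web02"}, "body": "second line"},
]
bulk_body = build_bulk_body(events, "flume-sample")
print(bulk_body)
```

Since the body is plain JSON over HTTP, nothing on the Flume side needs to
link against a particular ElasticSearch client library.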

I am still testing and integrating the various components.

Please check it out when you get a chance and send me some feedback.

Thanks.