I created a Flume plugin with multiple components that complements the
current version of Apache Flume.
I needed it for a personal project I am working on.
It is code-named Flume Jambalaya.
Jambalaya is a standalone Apache Flume plugin that contains a variety of
sources, interceptors, channels, sinks, serializers and other components
designed to extend the Flume architecture. It has been released under the
Apache License, Version 2.0.
It currently contains:
(a) File Source - This source lets you ingest data by tailing files from a
(b) ElasticSearch HTTP Sink - This sink sends events to an ElasticSearch
cluster via HTTP. Because it only speaks HTTP, the ElasticSearch version on
the Flume side does not have to match the version of the server cluster.
(c) DateInterceptor - The date interceptor is used for parsing dates from
fields and using that date or timestamp as the timestamp for the Flume
(d) Grok Interceptor - This interceptor lets you extract structured data
from unstructured text and inject it as headers into the event.
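
To give a rough idea of what the Date and Grok interceptors do conceptually, here is a minimal, self-contained Java sketch. It is not the plugin's code and does not use the Flume API: the class name InterceptorSketch, the regular expression, and the field names logdate, level and message are all hypothetical, and a plain Map stands in for the event headers. The only Flume convention it relies on is that the event time is carried in a header named "timestamp" as epoch milliseconds.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: hypothetical names, not the Jambalaya implementation.
public class InterceptorSketch {

    // A grok expression boils down to a regex with named groups.
    // This pattern and its group names are assumptions for the example.
    private static final Pattern LINE = Pattern.compile(
            "(?<logdate>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}) "
            + "(?<level>[A-Z]+) (?<message>.*)");

    private static final String[] FIELDS = {"logdate", "level", "message"};

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Grok-style step: copy each named group into the headers map.
    // Returns an empty map when the line does not match.
    static Map<String, String> extractHeaders(String body) {
        Map<String, String> headers = new HashMap<>();
        Matcher m = LINE.matcher(body);
        if (m.matches()) {
            for (String field : FIELDS) {
                headers.put(field, m.group(field));
            }
        }
        return headers;
    }

    // Date step: parse the given header (assumed to be UTC here) and
    // store it as epoch milliseconds under the "timestamp" header.
    static void applyTimestamp(Map<String, String> headers, String field) {
        String raw = headers.get(field);
        if (raw == null) {
            return; // nothing to parse, leave headers untouched
        }
        long millis = LocalDateTime.parse(raw, FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
        headers.put("timestamp", Long.toString(millis));
    }

    public static void main(String[] args) {
        Map<String, String> headers =
                extractHeaders("2013-05-01 12:30:00 ERROR disk full");
        applyTimestamp(headers, "logdate");
        System.out.println(headers);
    }
}
```

In a real agent the two steps would be separate interceptors chained on a source, with the pattern and date format coming from the agent configuration rather than being hard-coded.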
Sample configuration files are available here
I did not realize that the Flume trunk already has an HTTP Sink for
ElasticSearch so you can decide whether or not to use the sink that comes
I am still testing and integrating the various components.
Please check it out when you get a chance and send me some feedback.