Flume, mail # dev - A couple of features regarding ElasticSearchSink (ESS)

Dibyajyoti Ghosh 2013-10-03, 22:32

I am using the Flume ElasticSearch (ES) sink for my project. The Flume
version is:

Flume 1.4.0-cdh4.4.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 154d35659212f07edc896b414a43996fb8121773
Compiled by jenkins on Tue Sep  3 20:53:28 PDT 2013
From source with checksum f95b4a7f48080f876d6482bb88bcc342

And ElasticSearch v0.90.1.

I am having two issues with the current set of ES configuration options
allowed in the Flume agent.conf:

*agent.sinks.myESsink.indexName = myIndex*

*agent.sinks.myESsink.ttl = <some integer value>*

ElasticSearchSink uses the provided index name as a prefix and appends
"YYYY-MM-DD" to generate the actual index in ES. While that is convenient
for my testing purposes, it doesn't allow creating indices monthly or
yearly or, more generally, based on a date pattern provided in the Flume
config, similar to the HDFS fileSuffix, e.g.

*agent.sinks.myESsink.indexSuffix = "YYYY"* would create indices named
myIndex-2013, myIndex-2014, etc. When the suffix is not provided, the sink
could either use just the index name or default back to 'YYYY-MM-DD'.
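
To make the proposal concrete, here is a minimal sketch of how a
configurable suffix could be applied. It is not the existing
ElasticSearchSink code: the class IndexNameSketch, the method indexName and
the indexSuffix setting are all hypothetical, and I am assuming the
fall-back-to-daily behaviour rather than the bare index name. Note that in
Java date patterns the calendar year is lowercase "yyyy" (uppercase "YYYY"
means week-based year), so the sketch uses "yyyy" and "yyyy-MM-dd".

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Flume: builds "<prefix>-<formatted date>".
class IndexNameSketch {
    // Assumed default, mirroring the daily suffix the sink appends today.
    static final String DEFAULT_SUFFIX_PATTERN = "yyyy-MM-dd";

    static String indexName(String prefix, String suffixPattern, Instant eventTime) {
        // Fall back to the daily pattern when no suffix is configured.
        String pattern = (suffixPattern == null || suffixPattern.isEmpty())
                ? DEFAULT_SUFFIX_PATTERN
                : suffixPattern;
        // Format in UTC so the index name does not depend on the agent's time zone.
        DateTimeFormatter formatter =
                DateTimeFormatter.ofPattern(pattern).withZone(ZoneOffset.UTC);
        return prefix + "-" + formatter.format(eventTime);
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2013-10-03T22:32:00Z");
        System.out.println(indexName("myIndex", "yyyy", t));     // yearly index
        System.out.println(indexName("myIndex", "yyyy-MM", t));  // monthly index
        System.out.println(indexName("myIndex", null, t));       // daily default
    }
}
```

With the timestamp above, the three calls produce myIndex-2013,
myIndex-2013-10 and myIndex-2013-10-03 respectively.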

The second one is comparatively trivial but good to have. The TTL
currently defaults to 5 days and accepts only integers, which are treated
as days.

It would be good to have a unit qualifier such as "d" / "s" / "m" / "w" /
"h" to mimic the TTL conf in the ElasticSearch mapping.
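
A rough sketch of the kind of parsing I have in mind follows. The class
TtlParseSketch and the method parseTtlMillis are hypothetical names, not
existing Flume code, and I am assuming that a bare integer should keep its
current meaning of days so that existing configs do not break.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flume: converts "5", "5d", "2w", "90m" to milliseconds.
class TtlParseSketch {
    // Digits followed by an optional single-letter unit: s, m, h, d or w.
    private static final Pattern TTL = Pattern.compile("^(\\d+)\\s*([smhdw]?)$");

    static long parseTtlMillis(String raw) {
        Matcher m = TTL.matcher(raw.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid TTL value: " + raw);
        }
        long amount = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "s": return TimeUnit.SECONDS.toMillis(amount);
            case "m": return TimeUnit.MINUTES.toMillis(amount);
            case "h": return TimeUnit.HOURS.toMillis(amount);
            case "w": return TimeUnit.DAYS.toMillis(amount * 7);
            case "d":
            default:  // no qualifier: assumed to mean days, as it does today
                return TimeUnit.DAYS.toMillis(amount);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseTtlMillis("5"));   // bare integer, read as days
        System.out.println(parseTtlMillis("2w"));
        System.out.println(parseTtlMillis("90m"));
    }
}
```

Anything that is not a non-negative integer with one of the five
qualifiers is rejected with an IllegalArgumentException instead of being
silently ignored.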

For the second case I have already made changes in my local Flume git repo
and am currently testing them.

I will start working on the index naming one shortly, once I get the
easier of the two issues fixed and running in my local deployments.

I didn't find any JIRA tickets for these requirements in the Flume JIRA and
was wondering how to get these changes into the central Flume code base.
That would alleviate the pain of maintaining a local Flume development
branch, and my requirements seemingly have broad applicability.

Please suggest how I should proceed, and thank you for bearing with this
long email.

- Dib