Flume, mail # user - ElasticSearchSink - A couple of feature requests


Re: ElasticSearchSink - A couple of feature requests
Dibyajyoti Ghosh 2013-10-04, 20:55
Thanks Hari.

I am creating JIRA tickets for the improvements.

Best,
- Dib
On Fri, Oct 4, 2013 at 1:45 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:

>  Hi,
>
> I am not too familiar with ElasticSearch. If you want to file a jira,
> someone might pick it up when they have time.
>
>
> Thanks,
> Hari
>
> On Friday, October 4, 2013 at 12:14 PM, Dibyajyoti Ghosh wrote:
>
> Hi all,
>
> This is a repost from [EMAIL PROTECTED]. I was not sure whether the Flume
> developers got the email, so please pardon the repost if it feels like I am
> spamming the mailing list.
>
> I have a couple of feature requests for ElasticSearchSink and didn't find
> open JIRA tickets for these requirements.
>
> I have already modified ElasticSearchSink locally for the smaller of the
> feature requests, and the larger one is in progress. I wanted to discuss the
> features with you before creating the JIRA tickets, so here is a brief
> summary of the improvements I have in mind.
>
>
> DETAILS>>>
>
> Flume version:
>
> Flume 1.4.0-cdh4.4.0
> Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
> Revision: 154d35659212f07edc896b414a43996fb8121773
> Compiled by jenkins on Tue Sep  3 20:53:28 PDT 2013
> From source with checksum f95b4a7f48080f876d6482bb88bcc342
>
> And ElasticSearch v0.90.1.
>
> *Improvement request #1 - HDFS file suffix style index suffix in
> ElasticSearchSink:*
>
> *agent.sinks.myESsink.indexName = myIndex*
>
> ElasticSearchSink uses the provided index name as an index prefix and
> appends "YYYY-MM-DD" to generate the actual index name in ES. That is
> convenient for my testing purposes, but it doesn't allow creating monthly or
> yearly indices or, more generally, indices named from a date pattern
> provided in the Flume config, similar to HDFS fileSuffix, e.g.
>
> *agent.sinks.myESsink.indexSuffix = "YYYY"* will create indices such as
> myIndex-2013 / myIndex-2014. When the suffix is not provided, the sink can
> either create the index with just the index name or default back to
> 'YYYY-MM-DD'.
>
> *Improvement request #2 - ElasticSearchSink ttl field modification to
> mimic actual ES:*
>
> *agent.sinks.myESsink.ttl = <some integer value> (current specification)*
>
> The second one is comparatively trivial but good to have. The current TTL
> setting defaults to 5 days and accepts integers only, which are treated as
> days.
>
> It would be good to have a qualifier like "d" / "s" / "m" / "w" / "h" to
> mimic the TTL configuration in an ElasticSearch mapping.
>
> *agent.sinks.myESsink.ttl = "3w" / 3 (requested specification)*
>
> For the TTL, I have already made changes in my local Flume git repo and am
> currently testing them. The change doesn't break the existing way of
> specifying the TTL field; it only extends it to allow "1d" / "2w" style TTL
> specifications.
>
> <<<DETAILS
>
> Kindly suggest what I should do to get these changes incorporated into
> future release(s) of Flume.
>
> Best and thanks,
> - Dib
>
>
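For illustration, improvement request #1 in the quoted message could be sketched roughly as follows. This is not the ElasticSearchSink code or the local patch mentioned above; the class IndexNameSketch, the method indexName, and the use of java.time are assumptions made for the example. One choice between the two fallbacks described in the request is also assumed: a missing suffix falls back to the daily pattern, and an empty suffix yields the bare index name. Note that Java date patterns use lowercase "yyyy" for the calendar year, so the "YYYY" written in the request corresponds to "yyyy" here.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Flume: builds an index name from the
// configured prefix and an optional date pattern for the suffix.
class IndexNameSketch {
    // Assumed default, matching the daily suffix described in the request.
    static final String DEFAULT_SUFFIX_PATTERN = "yyyy-MM-dd";

    static String indexName(String prefix, String suffixPattern, Instant eventTime) {
        // Assumption: null means "not configured", so use the daily pattern.
        if (suffixPattern == null) {
            suffixPattern = DEFAULT_SUFFIX_PATTERN;
        }
        // Assumption: an explicitly empty pattern means "no suffix at all".
        if (suffixPattern.isEmpty()) {
            return prefix;
        }
        // Format in UTC so the index name does not depend on the agent's zone.
        DateTimeFormatter formatter =
                DateTimeFormatter.ofPattern(suffixPattern).withZone(ZoneOffset.UTC);
        return prefix + "-" + formatter.format(eventTime);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(indexName("myIndex", "yyyy", now));    // yearly index
        System.out.println(indexName("myIndex", "yyyy-MM", now)); // monthly index
        System.out.println(indexName("myIndex", null, now));      // daily default
    }
}
```

Taking the pattern as a plain string keeps the config surface to a single property, at the cost of failing at runtime if the pattern is not a valid date format.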
>
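Similarly, the TTL extension in improvement request #2 could look something like the sketch below. It is an illustration only, not the change being tested in the local repo; the class TtlSketch and the method parseTtlMillis are hypothetical names. It assumes the qualifiers listed in the request ("s", "m", "h", "d", "w") and keeps the existing behaviour that a bare integer is treated as days.

```java
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flume: parses a TTL such as "3w" or "3"
// into milliseconds.
class TtlSketch {
    // Digits followed by an optional single-letter unit qualifier.
    private static final Pattern TTL_PATTERN = Pattern.compile("^(\\d+)([smhdw])?$");

    static long parseTtlMillis(String raw) {
        Matcher matcher = TTL_PATTERN.matcher(raw.trim());
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Invalid TTL value: " + raw);
        }
        long amount = Long.parseLong(matcher.group(1));
        // A bare integer keeps the existing meaning: days.
        String unit = matcher.group(2) == null ? "d" : matcher.group(2);
        switch (unit) {
            case "s": return TimeUnit.SECONDS.toMillis(amount);
            case "m": return TimeUnit.MINUTES.toMillis(amount);
            case "h": return TimeUnit.HOURS.toMillis(amount);
            case "d": return TimeUnit.DAYS.toMillis(amount);
            case "w": return TimeUnit.DAYS.toMillis(amount * 7);
            default:  throw new IllegalStateException("Unhandled unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseTtlMillis("3w")); // three weeks in milliseconds
        System.out.println(parseTtlMillis("3"));  // three days in milliseconds
    }
}
```

Because an unqualified number still means days, existing configurations would keep working unchanged under this scheme.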