Re: ElasticSearchSink - A couple of feature requests
Hi all,

Could one of the Flume JIRA admins please assign the ticket
https://issues.apache.org/jira/browse/FLUME-2206 to me? I am testing
the changes locally and have a patch I would like to submit for review.

Thanks,
- Dib
On Fri, Oct 4, 2013 at 1:55 PM, Dibyajyoti Ghosh
<[EMAIL PROTECTED]> wrote:

> Thanks Hari.
>
> I am creating JIRA tickets for the improvements.
>
> Best,
> - Dib
>
>
> On Fri, Oct 4, 2013 at 1:45 PM, Hari Shreedharan <
> [EMAIL PROTECTED]> wrote:
>
>>  Hi,
>>
>> I am not too familiar with ElasticSearch. If you want to file a JIRA
>> ticket, someone might pick it up when they have time.
>>
>>
>> Thanks,
>> Hari
>>
>> On Friday, October 4, 2013 at 12:14 PM, Dibyajyoti Ghosh wrote:
>>
>> Hi all,
>>
>> This is a repost from [EMAIL PROTECTED]. I was not sure whether the Flume
>> developers got the email, so please pardon the repost if it feels like I
>> am spamming the mailing list.
>>
>> I have a couple of feature requests for ElasticSearchSink and didn't find
>> open JIRA tickets for these requirements.
>>
>> I have already modified ElasticSearchSink locally for the smaller of the
>> two feature requests, and the larger one is in progress. I wanted to
>> discuss the features with you before creating the JIRA tickets, so here
>> is a brief summary of the improvements I have in mind.
>>
>>
>> DETAILS>>>
>>
>> Flume version:
>>
>> Flume 1.4.0-cdh4.4.0
>> Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
>> Revision: 154d35659212f07edc896b414a43996fb8121773
>> Compiled by jenkins on Tue Sep  3 20:53:28 PDT 2013
>> From source with checksum f95b4a7f48080f876d6482bb88bcc342
>>
>> And ElasticSearch v0.90.1.
>>
>> *Improvement request #1 - HDFS file suffix style index suffix in
>> ElasticSearchSink:*
>>
>> *agent.sinks.myESsink.indexName = myIndex*
>>
>> ElasticSearchSink uses the provided index name as an index prefix and
>> appends "YYYY-MM-DD" to generate the actual index in ES. While this is
>> convenient for my testing purposes, it doesn't allow creating indexes
>> monthly / yearly or, more generally, based on some regex provided in the
>> flume config, similar to the HDFS fileSuffix, e.g.
>>
>> *agent.sinks.myESsink.indexSuffix = "YYYY"* will create indexes such as
>> myIndex-2013 / myIndex-2014 etc. When not provided, the sink will create
>> the index with just the index name, or can default back to 'YYYY-MM-DD'.
>>
>> *Improvement request #2 - ElasticSearchSink ttl field modification to
>> mimic actual ES:*
>>
>> *agent.sinks.myESsink.ttl = <some integer value> (current specification)*
>>
>> The second one is comparatively trivial but good to have. The current
>> ElasticSearch TTL setting defaults to 5 days and works with integers
>> only, which are treated as days.
>>
>> It will be good to have a qualifier like "d" / "s" / "m" / "w" / "h" to
>> mimic the TTL configuration in ElasticSearch mapping.
>>
>> *agent.sinks.myESsink.ttl = "3w" / 3 (requested specification)*
>>
>> For the ttl I have already made changes in my local flume git repo and am
>> currently testing them. The change doesn't break the existing way of
>> specifying the TTL field; it only extends it to allow "1d" / "2w" style
>> TTL specification.
>>
>> <<<DETAILS
>>
>> Kindly suggest what I should do to get these changes incorporated in
>> future release(s) of Flume.
>>
>> Best and thanks,
>> - Dib
>>
>>
>>
>
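
For readers following the two requests in the quoted message, here is a
minimal sketch of the behaviour being asked for. It is not the patch
attached to FLUME-2206 and not the sink's actual code (the sink is written
in Java); Python is used only for brevity. The function names index_name
and parse_ttl are hypothetical. Two assumptions are made that the thread
does not state: the tokens YYYY / MM / DD are mapped to year / month / day,
and the "m" qualifier is taken to mean minutes. A bare integer TTL is
treated as days, as described in the message.

```python
import re
from datetime import datetime, timedelta

# Hypothetical mapping from the thread's suffix tokens to strftime codes.
_SUFFIX_TOKENS = (("YYYY", "%Y"), ("MM", "%m"), ("DD", "%d"))

# TTL qualifiers listed in the thread; "m" = minutes is an assumption.
_TTL_UNITS = {"s": "seconds", "m": "minutes", "h": "hours",
              "d": "days", "w": "weeks"}
_TTL_PATTERN = re.compile(r"(\d+)([smhdw]?)")


def index_name(prefix, when, suffix_pattern="YYYY-MM-DD"):
    """Build the index name for an event timestamp (request #1).

    An empty or missing suffix pattern yields just the prefix.
    """
    if not suffix_pattern:
        return prefix
    fmt = suffix_pattern
    for token, code in _SUFFIX_TOKENS:
        fmt = fmt.replace(token, code)
    return f"{prefix}-{when.strftime(fmt)}"


def parse_ttl(value):
    """Parse a TTL such as 3, "3" or "3w" into a timedelta (request #2).

    A value without a qualifier keeps the existing meaning: days.
    """
    text = str(value).strip().strip('"')
    match = _TTL_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported TTL value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_TTL_UNITS[unit or "d"]: int(amount)})


event_time = datetime(2013, 10, 4)
print(index_name("myIndex", event_time, "YYYY"))  # myIndex-2013
print(index_name("myIndex", event_time))          # myIndex-2013-10-04
print(parse_ttl("3w") == timedelta(weeks=3))      # True
print(parse_ttl(3) == timedelta(days=3))          # True
```

Keeping the bare-integer case mapped to days is what lets the extended
syntax stay backward compatible with existing configurations, which is the
property the message says the local change preserves.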