Flume >> mail # dev >> Re: ElasticSearchSink - A couple of feature requests


Edward Sargisson 2013-10-27, 15:03
Re: ElasticSearchSink - A couple of feature requests
Hi Edward,

Thank you for reaching out. Of the two features I requested for the
ElasticSearch sink in Flume, I have implemented the smaller one
(https://issues.apache.org/jira/browse/FLUME-2206), which allows users to
specify TTL values with a unit qualifier (hour, day, week, etc.), as current
versions of ElasticSearch do. I have posted the patch for review at
https://reviews.apache.org/r/14614/.
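
As a rough illustration of the TTL format described above, the following
Python sketch converts a value such as "5d" or "2w" into milliseconds. The
function name parse_ttl_ms, the unit table, and the choice to treat a bare
integer as a number of days are assumptions made for this example; it is not
the code from the FLUME-2206 patch.

```python
import re

# Milliseconds per unit. The suffixes mirror Elasticsearch-style time
# values (ms, s, m, h, d, w); this table is an assumption of the sketch.
_UNIT_MS = {
    "ms": 1,
    "s": 1000,
    "m": 60 * 1000,
    "h": 60 * 60 * 1000,
    "d": 24 * 60 * 60 * 1000,
    "w": 7 * 24 * 60 * 60 * 1000,
}

# An integer amount followed by an optional unit suffix.
_TTL_RE = re.compile(r"(\d+)\s*(ms|s|m|h|d|w)?")


def parse_ttl_ms(value, default_unit="d"):
    """Convert a TTL string such as "5", "5d" or "2w" to milliseconds.

    A bare integer falls back to default_unit (days here, which is an
    assumption chosen to match the integer-only behaviour in the thread).
    """
    match = _TTL_RE.fullmatch(value.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised TTL value: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_MS[unit or default_unit]
```

With this sketch, "5" and "5d" give the same result, so an existing
integer-only setting would keep its meaning.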

The second, bigger one would let users specify the ElasticSearch index name
with or without a rolling specifier. Currently the user provides an indexName,
say "flume", and ElasticSearchSink appends %daytimestamp (e.g. 2013-10-28) to
create the index "flume-2013-10-28", creating a new index every day. This
works well when the user wants to roll indices daily, but it prevents
creating indices on a monthly basis (e.g. "flume-2013-10", "flume-2013-11"),
a yearly basis, and so on. Essentially, I was looking for HDFS
filePrefix-style ElasticSearch index naming. I haven't started working on
this patch yet, so please go ahead if you want to work on this feature
request. I have already created a JIRA ticket for it
(https://issues.apache.org/jira/browse/FLUME-2207).
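
To make the naming scheme concrete, here is a minimal Python sketch of
building an index name from a prefix and an optional rolling pattern. The
function index_name and its parameters are hypothetical, and the pattern uses
Python strftime codes (%Y, %m, %d) rather than whatever syntax the sink would
eventually accept; it only illustrates the daily, monthly, yearly and
non-rolling cases mentioned above.

```python
from datetime import datetime


def index_name(prefix, event_time, suffix_pattern="%Y-%m-%d"):
    """Build an index name such as "flume-2013-10-28".

    suffix_pattern is a strftime pattern; an empty pattern means the
    index does not roll and the bare prefix is used. The default mirrors
    the daily behaviour described in the thread.
    """
    if not suffix_pattern:
        return prefix
    return f"{prefix}-{event_time.strftime(suffix_pattern)}"


when = datetime(2013, 10, 28)
daily = index_name("flume", when)             # daily rolling (default)
monthly = index_name("flume", when, "%Y-%m")  # monthly rolling
yearly = index_name("flume", when, "%Y")      # yearly rolling
fixed = index_name("flume", when, "")         # no rolling
```

Keeping the daily pattern as the default would leave existing configurations
unchanged while allowing coarser rolling periods.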

Best,
- Dib
On Sun, Oct 27, 2013 at 8:03 AM, Edward Sargisson <[EMAIL PROTECTED]> wrote:

> Hi Dib,
> I seem to spend the most time maintaining the Elasticsearch Sink and,
> sadly, am *way* behind on email.
>
> If you raise Jira issues for your proposed changes and set up the Review
> board for them then either I or a colleague should be able to take a look.
> Normally, once we're happy, a committer will commit them to the repository.
>
> I will note that I'm on parental leave until Dec 9 and won't have a chance
> to have a look until then. However, when everything's ready drop me an
> email and I'll see if a colleague has time.
>
> Cheers,
> Edward
>
> On Friday, October 4, 2013 at 12:14 PM, Dibyajyoti Ghosh wrote:
>
> > Hi all,
> >
> > This is a repost from [EMAIL PROTECTED] (mailto:[EMAIL PROTECTED]).
> > I was not sure whether the Flume developers got the email, so pardon my
> > repost if it feels like I am spamming the mailing list.
> >
> > I have a couple of feature requests for ElasticSearchSink and didn't find
> > open JIRA tickets for these requirements.
> >
> > I have already modified ElasticSearchSink locally for the smaller of the
> > feature requests, and the larger one is in progress. I wanted to discuss
> > the features with you before creating the JIRA tickets, so here is a
> > brief summary of the improvements I have in mind.
> >
> >
> > DETAILS>>>
> >
> > Flume version:
> >
> > Flume 1.4.0-cdh4.4.0
> > Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
> > Revision: 154d35659212f07edc896b414a43996fb8121773
> > Compiled by jenkins on Tue Sep  3 20:53:28 PDT 2013
> > From source with checksum f95b4a7f48080f876d6482bb88bcc342
> >
> >
> > And ElasticSearch v0.90.1.
> >
> > Improvement request #1 - HDFS file suffix style index suffix in
> > ElasticSearchSink:
> >
> > agent.sinks.myESsink.indexName = myIndex
> >
> > ElasticSearchSink uses the provided index name as an index prefix and
> > appends "YYYY-MM-DD" to generate the actual index name in ES. While this is
> > convenient for my testing purposes, it doesn't allow creating indices
> > monthly or yearly or, more generally, based on some regex provided in the
> > Flume config, similar to HDFS fileSuffix, e.g.
> >
> > agent.sinks.myESsink.indexSuffix = "YYYY" will create indices named
> > myIndex-2013, myIndex-2014, etc. When not provided, the sink will create
> > an index with just the index name, or can default back to 'YYYY-MM-DD'.
> >
> > Improvement request #2 - ElasticSearchSink ttl field modification to
> > mimic actual ES:
> >
> > agent.sinks.myESsink.ttl = <some integer value> (current specification)
> >
> > The second one is comparatively trivial but good to have. Current
> > ElasticSearch TTL defaults to 5 days and works with integers only again