Flume >> mail # dev >> ElasticSearchSink: should we combine the proposed interfaces for event serialization and id assignment


ElasticSearchSink: should we combine the proposed interfaces for event serialization and id assignment
Hi all,
(I read this list in digest mode; would you mind ccing me on any replies?)

I've got two patches progressing through Jira (FLUME-1782, FLUME-1972).
-1782 fixes a defect where the wrong timestamp field and elasticsearch
index name are used. -1972 adds an interface which users can implement to
assign an id instead of letting elasticsearch randomly assign one.

The question to discuss: should we (I) combine those two interfaces into a
single interface?

Mike Percy has kindly reviewed FLUME-1782, and the knock-on effect of his
comments is that the ElasticSearchEventSerializer interface has to change,
which makes this a breaking change. I had been attempting to avoid
that, which is why -1972 introduces a new interface.

If we're going to break the interface then maybe we should go all the way
and put the new id provider functionality on to it as well? We could also
rename it to ElasticsearchEventSerializer (lower case s on search) to match
the way the maintainer of elasticsearch spells it.

The new interface would be:
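import java.io.IOException;
import java.nio.charset.Charset;
import org.apache.flume.Event;
import org.apache.flume.conf.Configurable;
import org.apache.flume.conf.ConfigurableComponent;
import org.elasticsearch.common.xcontent.XContentBuilder;

public interface ElasticSearchEventSerializer extends Configurable,
    ConfigurableComponent {

  static final Charset charset = Charset.defaultCharset();
  XContentBuilder getContentBuilder(Event event, Long timestampOverride)
      throws IOException;

  // Returns the id to index the event under.
  String getId(Event event);
}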

The timestampOverride would only be non-null if there was no timestamp
header.
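
To make the timestampOverride contract concrete, here is a rough,
self-contained sketch of how a sink and an implementation might cooperate.
It is not the patch itself: SimpleEvent, CombinedSerializer,
HeaderIdSerializer and timestampOverrideFor are hypothetical stand-ins, a
plain Map replaces XContentBuilder so the example runs without Flume or
elasticsearch on the classpath, and the "id" header and "@timestamp" field
names are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class SerializerSketch {

  // Hypothetical stand-in for Flume's Event: headers plus a body.
  static final class SimpleEvent {
    final Map<String, String> headers;
    final String body;

    SimpleEvent(Map<String, String> headers, String body) {
      this.headers = headers;
      this.body = body;
    }
  }

  // Simplified shape of the combined interface; a Map replaces XContentBuilder.
  interface CombinedSerializer {
    Map<String, Object> getContent(SimpleEvent event, Long timestampOverride);

    String getId(SimpleEvent event);
  }

  // Sink side: null when the event has a timestamp header, else the fallback.
  static Long timestampOverrideFor(SimpleEvent event, long fallbackMillis) {
    return event.headers.containsKey("timestamp")
        ? null
        : Long.valueOf(fallbackMillis);
  }

  static final class HeaderIdSerializer implements CombinedSerializer {
    @Override
    public Map<String, Object> getContent(SimpleEvent event, Long timestampOverride) {
      // Use the override when given; otherwise the header must be present.
      long timestamp = (timestampOverride != null)
          ? timestampOverride.longValue()
          : Long.parseLong(event.headers.get("timestamp"));
      Map<String, Object> doc = new HashMap<>();
      doc.put("@timestamp", timestamp); // field name is an assumption
      doc.put("body", event.body);
      return doc;
    }

    @Override
    public String getId(SimpleEvent event) {
      // Assumption: the id travels in a header named "id"; null if absent.
      return event.headers.get("id");
    }
  }

  public static void main(String[] args) {
    Map<String, String> headers = new HashMap<>();
    headers.put("id", "event-42");
    SimpleEvent event = new SimpleEvent(headers, "hello");
    CombinedSerializer serializer = new HeaderIdSerializer();
    Long override = timestampOverrideFor(event, System.currentTimeMillis());
    System.out.println(serializer.getId(event));
    System.out.println(serializer.getContent(event, override));
  }
}
```

In this sketch a null return from getId is taken to mean "let elasticsearch
assign the id"; whether the real interface should signal that with null or
in some other way is still open.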

Thoughts?

Cheers,
Edward