Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Flume >> mail # dev >> Review Request: FLUME-1782 Elastic Search sink does not use UTC to determine the correct index to write to


+
Edward Sargisson 2013-04-09, 17:59
+
Edward Sargisson 2013-04-09, 17:59
+
Israel Ekpo 2013-04-11, 10:23
+
Edward Sargisson 2013-04-11, 18:58
+
Israel Ekpo 2013-04-11, 10:28
+
Edward Sargisson 2013-04-11, 18:58
+
Edward Sargisson 2013-04-11, 19:03
Copy link to this message
-
Re: Review Request: FLUME-1782 Elastic Search sink does not use UTC to determine the correct index to write to

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10379/#review19358
-----------------------------------------------------------
Edward,

Here are some comments Mike Percy left on the JIRA issue:

Hi Edward,
Thanks for the patch! I have one concern about the approach and a few minor requests/questions.
Concern: Since the sink is modifying the event itself, this can cause issues with memory channel rollback. i.e. if there is a channel rollback because ES is down, then any event processed when it comes back up will actually have the timestamp set by a previous run of the process() method. Depending on why you might want to have a local timestamp, that could be unexpected and/or desired.
Instead of ensuring that the timestamp is set in the event itself at the sink, we could provide some other method of falling back to a local time in the sink itself if the event doesn't have a timestamp set.
tl;dr: Don't modify the Event in the sink
Other minor feedback:
1. It would be great to post this onto Review Board @ https://reviews.apache.org/groups/Flume/ so that individual lines in the patch are easy to reference during review
2. In shouldSetAndWarnWhenNoHeaders() I don't think any Event should ever have null headers. Have you seen cases where that can happen?
3. Nit: can we use a constant for the values 1355364900011 and 1355364900079 that are used throughout this test? They are kinda magic numbers and it could use a little more explanation and/or ensuring they are used consistently by making them private static final variables.
4. Nit: consider using Charsets.UTF_8 from Guava rather than Charset.forName("UTF-8") ... this is optional and not a big deal.
Best,
Mike

- Israel Ekpo
On April 11, 2013, 7:03 p.m., Edward Sargisson wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10379/
> -----------------------------------------------------------
>
> (Updated April 11, 2013, 7:03 p.m.)
>
>
> Review request for Flume and Hari Shreedharan.
>
>
> Description
> -------
>
> This change gets the timestamp from the event and uses it (in UTC) to determine the name of the index to write to. This is required to match the behaviour of Logstash so that Kibana can find the log events.
> The previous code would use the current time and would do it in the timezone of the Flume agent's host.
>
>
> This addresses bug FLUME-1782.
>     https://issues.apache.org/jira/browse/FLUME-1782
>
>
> Diffs
> -----
>
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 693c0d7
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ElasticSearchSink.java 1b3db14
>   flume-ng-sinks/flume-ng-elasticsearch-sink/src/test/java/org/apache/flume/sink/elasticsearch/TestElasticSearchSink.java 94b95b1
>
> Diff: https://reviews.apache.org/r/10379/diff/
>
>
> Testing
> -------
>
> All unit tests and integration tests pass. A snapshot using commit 5b9d31f1ad228 and the patch for flume-1972 has passed our internal integration tests using customisations.
>
>
> Thanks,
>
> Edward Sargisson
>
>

+
Edward Sargisson 2013-04-18, 23:39
+
Mike Percy 2013-04-19, 00:15
+
Edward Sargisson 2013-04-19, 18:04
+
Edward Sargisson 2013-04-22, 20:47