Flume, mail # dev - Duplicate on failover


Re: Duplicate on failover
Hari Shreedharan 2012-08-30, 16:22
Hi Ralph,

Sorry, I missed this message earlier. How are you simulating failover in your test? I did not look at your code. If the message was written by the Avro Source on the client, and the Avro Sink on the other side simply did not get a success response, the failover sink processor would retry the same message: the sink rolls the transaction back, so the channel makes the event available to another sink. Generally, if a message is not acked as successfully written to the channel by the Avro Source, the sink will roll back the transaction and throw an EventDeliveryException; with the Failover SinkProcessor, this causes the next sink to pick the event up.

Also, note that Flume guarantees at-least-once semantics and weak ordering. If a failure happens, it is possible that there will be duplicates.
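
To make the sequence above concrete, here is a minimal, self-contained Java sketch of how a lost acknowledgement can turn into a duplicate on failover. It does not use the Flume API: SketchSink, drain, and simulate are hypothetical names, the "channel" is just an in-memory deque, and the "lost ack" is simulated by throwing after the receiver has already stored the event. Treat it as an illustration of at-least-once delivery under those assumptions, not as Flume's actual implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class FailoverDuplicateSketch {

    /** Hypothetical stand-in for a sink; this is NOT Flume's Sink interface. */
    interface SketchSink {
        void deliver(String event) throws Exception;
    }

    /**
     * Drains the channel, trying sinks in priority order.
     * Returns the number of delivery attempts made.
     */
    static int drain(Deque<String> channel, List<SketchSink> sinks) {
        int attempts = 0;
        int active = 0; // index of the sink currently in use
        while (!channel.isEmpty() && active < sinks.size()) {
            String event = channel.pollFirst(); // "take" inside a transaction
            attempts++;
            try {
                sinks.get(active).deliver(event);
                // "commit": the event is now gone from the channel
            } catch (Exception e) {
                channel.addFirst(event); // "rollback": event is available again
                active++;                // fail over to the next sink
            }
        }
        return attempts;
    }

    /** Runs the scenario and returns what each receiver ended up with. */
    static List<List<String>> simulate() {
        List<String> firstReceiver = new ArrayList<>();
        List<String> secondReceiver = new ArrayList<>();

        // Assumption: the receiver stores "e2", but the ack never reaches the sink.
        SketchSink primary = event -> {
            firstReceiver.add(event);
            if (event.equals("e2")) {
                throw new Exception("ack lost");
            }
        };
        SketchSink backup = secondReceiver::add;

        Deque<String> channel = new ArrayDeque<>(List.of("e1", "e2", "e3"));
        drain(channel, List.of(primary, backup));
        return List.of(firstReceiver, secondReceiver);
    }

    public static void main(String[] args) {
        List<List<String>> result = simulate();
        System.out.println("first receiver:  " + result.get(0));
        System.out.println("second receiver: " + result.get(1));
    }
}
```

In this sketch the first receiver ends up with e1 and e2 and the second with e2 and e3, so the last event seen by the first receiver is also the first event seen by the second. Avoiding that would require de-duplication downstream, for example by event ID, since the sending side cannot tell a lost ack from a failed write.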

And no, this is not related to any of the FileChannel issues we have been fixing.

Thanks,
Hari

--
Hari Shreedharan
On Thursday, August 30, 2012 at 7:50 AM, Ralph Goers wrote:

> I'm going to try again. Does this problem sound familiar to anyone?
>
> Ralph
>
> On Aug 27, 2012, at 3:36 PM, Ralph Goers wrote:
>
> > Does anyone have any thoughts on this? Is it possibly related to any of the issues already being fixed on the FileChannel?
> >
> > Ralph
> >
> > On Aug 26, 2012, at 4:05 PM, Ralph Goers wrote:
> >
> > > I have successfully embedded Flume into the Log4j 2 Appender. However, I have a unit test that has Flume fail over from one AvroSink to another. When this happens, under some circumstances I am getting the last message successfully delivered to the first source as the first message to the second source, which doesn't seem correct. The unit test is at https://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/flume-ng/src/test/java/org/apache/logging/log4j/flume/appender/FlumeEmbeddedAppenderTest.java. The odd thing is that I cannot get this to fail on my local machine. It only fails when Gump runs it, but it fails fairly consistently.
> > >
> > > The unit test has the AppenderSource connect to a FileChannel. Two AvroSinks are connected to the FileChannel via the Failover processor.
> > >
> > > Is this a known behavior?
> > >
> > > Ralph