Henry Ma 2013-01-24, 03:26
Dear Flume developers and users,
I understand that Flume NG uses channel-based transactions to guarantee reliable message delivery between agents. But in some extreme failure scenarios, will Flume remain fully reliable? I have thought of the scenarios below.
1. In a transaction between agents, what happens if the receiving agent process goes down just after it commits its put transaction and before it sends the success indication to the sending agent? Will the sending agent send the same event again when the receiving agent recovers, causing data duplication?
2. In the communication between the client (the data source, sending data to the first-hop agent) and the first-hop agent, what happens if the agent process goes down just after it receives the event and before it saves the event to its channel? Will this cause data loss?
3. In the communication between the final-hop agent and the storage system (such as MySQL, HDFS, a file system, etc.), what happens if the agent goes down before it commits the saving transaction but after it has already saved some data in the storage? Will this cause data duplication after the agent recovers?
Thank you very much!
--
Best Regards,
Henry Ma
Re: Reliability in Flume
Juhani Connolly 2013-01-24, 07:33
Hi Henry,
Just to add to Mike's response:
When used with durable channels (mainly the file channel) and with transports that can be rolled back (Avro), message delivery is guaranteed (eventually). The only way you can lose data is for a part of the chain to be permanently removed: hard disk failure or removal of the physical hardware.
Prevention of data duplication has never been an objective of Flume, though duplication is uncommon in a properly configured setup. The larger your batch sizes are, the more duplication you may get with each partial failure. Similarly, ordered arrival of data is not guaranteed. The best way to address these two issues, if they are a concern, is to run a map-reduce task or similar to reduce to unique rows and/or reorder.
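As a rough illustration of the "reduce to unique rows and/or reorder" step, here is a minimal single-process sketch in Python rather than an actual map-reduce job. It assumes that each event carries a unique ID and a timestamp attached by the producer; the `Event` type and its `event_id` and `timestamp` fields are hypothetical and not something Flume provides here.

```python
from typing import Dict, Iterable, List, NamedTuple


class Event(NamedTuple):
    event_id: str    # hypothetical unique ID attached by the producer
    timestamp: int   # hypothetical producer-side timestamp (epoch millis)
    body: str


def dedupe_and_sort(events: Iterable[Event]) -> List[Event]:
    """Keep the first copy of each event_id, then order by timestamp."""
    unique: Dict[str, Event] = {}
    for event in events:
        # setdefault keeps the first occurrence and ignores redelivered copies.
        unique.setdefault(event.event_id, event)
    # Sort by timestamp, using the ID as a tie-breaker for a stable result.
    return sorted(unique.values(), key=lambda e: (e.timestamp, e.event_id))


delivered = [
    Event("b", 2, "second"),
    Event("a", 1, "first"),
    Event("b", 2, "second"),   # duplicate from a retried batch
    Event("c", 3, "third"),
]
cleaned = dedupe_and_sort(delivered)
print([e.event_id for e in cleaned])
```

The same idea carries over to a real map-reduce job by using the event ID as the grouping key and emitting one row per key.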
Re: Reliability in Flume
Mike Percy 2013-01-24, 05:22
Henry, Please see inline...
On Wed, Jan 23, 2013 at 7:26 PM, Henry Ma <[EMAIL PROTECTED]> wrote:
> Dear Flume developers and users,
>
> I understand that Flume NG uses channel-based transactions to guarantee
> reliable message delivery between agents. But in some extreme failure
> scenarios, will Flume remain fully reliable? I have thought of the
> scenarios below.
>
> 1. In a transaction between agents, what happens if the receiving agent
> process goes down just after it commits its put transaction and before it
> sends the success indication to the sending agent? Will the sending agent
> send the same event again when the receiving agent recovers, causing data
> duplication?
Yes, it will cause duplication in this case. But that is not very common if you do proper capacity planning and tuning.
> 2. In the communication between the client (the data source, sending data
> to the first-hop agent) and the first-hop agent, what happens if the agent
> process goes down just after it receives the event and before it saves the
> event to its channel? Will this cause data loss?
It will not cause data loss because it saves to the channel before acknowledging the transaction.
> 3. In the communication between the final-hop agent and the storage system
> (such as MySQL, HDFS, a file system, etc.), what happens if the agent goes
> down before it commits the saving transaction but after it has already
> saved some data in the storage? Will this cause data duplication after the
> agent recovers?
Yes, this scenario can also cause duplicates.
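The difference between scenarios 1 and 2 can be sketched with a toy model in Python. This is not Flume's API: the `Receiver` class, its `crash_point` values, and `send_with_retry` are all hypothetical names made up for illustration. The only assumptions are the ones stated in this thread: the receiver commits to its channel before acknowledging, and the sender retries a batch until it sees an acknowledgement.

```python
class ReceiverCrashed(Exception):
    """Raised when the simulated receiving agent dies mid-request."""


class Receiver:
    def __init__(self, crash_point=None):
        self.channel = []               # stands in for a durable channel
        self.crash_point = crash_point  # "before_commit", "before_ack", or None

    def handle(self, batch):
        if self.crash_point == "before_commit":
            self.crash_point = None     # crash once, then "recover"
            raise ReceiverCrashed()     # nothing was stored, no ack sent
        self.channel.extend(batch)      # commit the put transaction
        if self.crash_point == "before_ack":
            self.crash_point = None     # crash once, then "recover"
            raise ReceiverCrashed()     # stored, but the sender never hears back
        return "ack"


def send_with_retry(receiver, batch, max_attempts=5):
    """Sender keeps the batch until it sees an ack (at-least-once delivery)."""
    for _ in range(max_attempts):
        try:
            if receiver.handle(batch) == "ack":
                return
        except ReceiverCrashed:
            continue  # no ack received: retry the same batch
    raise RuntimeError("gave up after max_attempts")


batch = ["e1", "e2"]

early = Receiver(crash_point="before_commit")  # like scenario 2
send_with_retry(early, batch)
print(early.channel)

late = Receiver(crash_point="before_ack")      # like scenario 1
send_with_retry(late, batch)
print(late.channel)
```

In this model the receiver that crashes before committing ends up with each event exactly once, because the sender never got an acknowledgement and simply resent the batch. The receiver that crashes after committing but before acknowledging ends up with the batch stored twice, which is the duplication described above.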
Regards, Mike