Reliability in Flume
Henry Ma 2013-01-24, 03:26
Dear Flume developers and users,
I understand that Flume NG uses channel-based transactions to guarantee
reliable message delivery between agents. But in some extreme failure
scenarios, does Flume remain fully reliable? I have thought of these cases:
1. In a transaction between agents, what will happen if the receiving agent
process goes down just after it commits its put transaction but before it
sends the success indication to the sending agent? Will the sending agent send
the same event again when the receiving agent recovers, and cause data
duplication?
2. In the communication between the client (the data source, which sends data
to the first-hop agent) and the first-hop agent, what will happen if the
agent process goes down just after it receives the event but before it saves
the event to its channel? Will it cause data loss?
3. In the communication between the final-hop agent and the storage system
(such as MySQL, HDFS, a file system, etc.), what happens if the agent goes down
before it commits the saving transaction but after it has already saved some
data in the storage? Will this cause data duplication after the agent recovers?
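
To make the first case concrete, here is a toy sketch of the sequence I am
worried about. It is not Flume code: ToyReceiver and send_with_retry are
hypothetical names, and I am only assuming that the sender retries whenever it
does not see an acknowledgement. I do not know whether Flume actually behaves
this way, which is what I am asking.

```python
# Toy model only: none of these names are Flume APIs.
class ToyReceiver:
    def __init__(self):
        self.channel = []  # events already committed to the channel

    def receive(self, event, crash_before_ack=False):
        self.channel.append(event)  # the put transaction commits here
        if crash_before_ack:
            # The process dies after the commit but before replying.
            raise ConnectionError("receiver died before acknowledging")
        return "OK"


def send_with_retry(receiver, event, crash_first_attempt):
    """Assumed sender behaviour: resend until an acknowledgement arrives."""
    attempts = 0
    while True:
        attempts += 1
        crash = crash_first_attempt and attempts == 1
        try:
            receiver.receive(event, crash_before_ack=crash)
            return attempts
        except ConnectionError:
            # The sender cannot tell whether the commit happened, so it resends.
            continue


receiver = ToyReceiver()
attempts = send_with_retry(receiver, "event-1", crash_first_attempt=True)
print(attempts, receiver.channel)  # 2 ['event-1', 'event-1']
```

In this toy model the event ends up in the channel twice, because the first
commit survived even though its acknowledgement was lost.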
Thank you very much!