
[FLUME-1995] Remote Channel for Apache Flume
I wanted to get some feedback from others before deciding whether or not to
continue working on this.

I initially filed this improvement/new feature because of use cases where
there is a hardware failure on the machine where the agent is currently running.

In terms of disaster recovery, having the events queue up on a remote
machine (preferably in the same internal network) will allow another agent
with the same configuration, running on a different machine, to pick them up
and resume transporting the data towards the sink.
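
To make the idea concrete, here is a minimal sketch of the failover flow described above. It is not Flume code: `RemoteStore` and `Agent` are hypothetical names, and the remote store is replaced by an in-memory queue so the example runs on its own.

```python
from collections import deque


class RemoteStore:
    """In-memory stand-in for a queue hosted on a separate machine (hypothetical)."""

    def __init__(self):
        self._events = deque()

    def put(self, event):
        self._events.append(event)

    def take(self):
        # Return the oldest queued event, or None when the queue is empty.
        return self._events.popleft() if self._events else None


class Agent:
    """Hypothetical agent whose channel state lives in the remote store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def receive(self, event):
        # Source side: queue the event remotely instead of on local disk.
        self.store.put(event)

    def drain(self, sink, limit=None):
        # Sink side: forward events until the store is empty or the limit is hit.
        sent = 0
        while limit is None or sent < limit:
            event = self.store.take()
            if event is None:
                break
            sink.append(event)
            sent += 1
        return sent


store = RemoteStore()
sink = []

primary = Agent("agent-a", store)
for i in range(5):
    primary.receive(f"event-{i}")
primary.drain(sink, limit=2)  # two events delivered, then the host fails
del primary                   # the agent is gone, the queued events are not

standby = Agent("agent-b", store)  # same configuration, different machine
standby.drain(sink)                # picks up the remaining three events
print(sink)
```

The sketch leaves out transactions: an event that has been taken from the store but not yet delivered when the agent dies would be lost here, so a real channel would presumably remove an event only after the sink confirms delivery.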

Sometimes, events may take a while to process and they may end up staying
in the channels (FileChannel) for a long time, during which hardware
failure could occur.

If the data in the events is mission critical, this could cause a lot of
headaches if there is no easy way to recover from the hardware failure
after events have been queued up in the file channel.

What are your thoughts on the remote channel? I understand there is a
JDBC Channel (http://flume.apache.org/FlumeUserGuide.html#jdbc-channel) but
I have heard it has performance issues.

This is why I am planning to use a NoSQL store to solve this.

I would like to get some feedback from others so that I can prioritize the
tasks in my JIRA queue, especially with the 1.4.0 release deadline drawing