What do you mean by a single-threaded model? Almost all of Flume's components are multithreaded. If you mean that a sink is driven by one thread, you can always add more sinks, and each one will be driven by its own thread. If you want to write the same data to multiple locations, just add more channels to the same source (thus replicating the data) and attach sinks as required. If you want to write to a higher-latency location, you can either add multiple sinks reading from the same channel (thus creating multiple sink runners), or make your sink multithreaded (spawn multiple threads inside the process method and wait for all of them to succeed or fail), so that more threads do the I/O.
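For illustration, here is a rough, untested sketch of an agent config that combines replication with several sinks on one channel. The agent name (a1), the component names, the port, the HDFS path and the class com.example.SlowTargetSink are all placeholders I made up; substitute your own source and your own sink for the high-latency target.

```properties
# Sketch only: names, port, path and the custom sink class are placeholders.
a1.sources = src
a1.channels = hdfsCh slowCh
a1.sinks = hdfsSink slow1 slow2 slow3

# The replicating selector (the default) copies every event to both channels.
a1.sources.src.type = avro
a1.sources.src.bind = 0.0.0.0
a1.sources.src.port = 41414
a1.sources.src.selector.type = replicating
a1.sources.src.channels = hdfsCh slowCh

a1.channels.hdfsCh.type = memory
a1.channels.slowCh.type = memory

# One sink, and so one sink runner thread, writes to HDFS.
a1.sinks.hdfsSink.type = hdfs
a1.sinks.hdfsSink.hdfs.path = hdfs://namenode/flume/events
a1.sinks.hdfsSink.channel = hdfsCh

# Three sinks drain the same channel, so three threads do the slow I/O.
# com.example.SlowTargetSink is a hypothetical custom sink class.
a1.sinks.slow1.type = com.example.SlowTargetSink
a1.sinks.slow1.channel = slowCh
a1.sinks.slow2.type = com.example.SlowTargetSink
a1.sinks.slow2.channel = slowCh
a1.sinks.slow3.type = com.example.SlowTargetSink
a1.sinks.slow3.channel = slowCh
```

Each event in slowCh is taken by only one of the three sinks, so adding sinks raises throughput to the slow target without duplicating data there.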
On Wednesday, November 7, 2012 at 10:48 AM, Nathaniel Auvil wrote:
> in addition to HDFS, i need to support sending events to a higher latency (network related) target which in our current implementation mitigates by using more than one thread. The model for Flume is single threaded. How do I support this with Flume? multiplex over n channels with a sink on each ?