Optional Channels
Hoping someone can point me in the right direction.  We're indexing our logs into Elasticsearch purely for added real-time convenience, and we want to make that step optional.  Essentially, if we fall behind writing to ES, we would prefer to skip ES altogether, since we have a more durable channel for higher-latency querying of the same data.  Optional channels seemed to fit, but we haven't had much success.

First, we configured a Memory Channel and marked it optional.  When the ES sink fell behind, the channel filled up and rejected new events.  However, the channel rejects them by throwing an exception, and the Channel Processor then rolls back the transaction, so the events are put back on the queue and retried.  The doc for getOptionalChannels says "A failure in writing the event to these channels must be ignored."  Should the transaction always commit when only optional channels fail (basically a best-effort, commit-what-you-could approach, since the channel was optional anyway)?
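To make the question concrete, here is a minimal sketch of the best-effort behavior I have in mind. It is not Flume's ChannelProcessor; SimpleChannel, BestEffortProcessor, and process are hypothetical names used only for illustration. Failures on required channels propagate so the source can retry, while failures on optional channels are counted and ignored.

```java
import java.util.ArrayList;
import java.util.List;

public class BestEffortProcessor {

    /** Hypothetical stand-in for a channel; NOT Flume's Channel interface. */
    public interface SimpleChannel {
        void put(String event) throws Exception;
    }

    /**
     * Writes the event to every required channel, then to every optional one.
     * A failure on a required channel propagates to the caller (so it can retry).
     * A failure on an optional channel is swallowed and only counted.
     *
     * @return the number of optional channels that failed to accept the event
     */
    public static int process(String event,
                              List<SimpleChannel> required,
                              List<SimpleChannel> optional) throws Exception {
        for (SimpleChannel channel : required) {
            channel.put(event); // any exception here aborts the whole delivery
        }
        int dropped = 0;
        for (SimpleChannel channel : optional) {
            try {
                channel.put(event);
            } catch (Exception e) {
                dropped++; // optional: ignore the failure and carry on
            }
        }
        return dropped;
    }

    public static void main(String[] args) throws Exception {
        List<String> durableStore = new ArrayList<>();
        SimpleChannel durable = e -> { durableStore.add(e); };
        SimpleChannel fullEsChannel = e -> { throw new IllegalStateException("channel full"); };

        int dropped = process("log line", List.of(durable), List.of(fullEsChannel));
        System.out.println("stored=" + durableStore.size() + " droppedOnOptional=" + dropped);
    }
}
```

With this shape, a full ES channel never causes the delivery to the durable channel to be retried.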

Second, we tried the PseudoTxMemoryChannel, but found that it also continues to bottleneck on ES.  It turns out that it uses queue.put instead of queue.offer, which means it blocks until there is room in the queue to add the event.  MemoryChannel uses offer.  Should PseudoTxMemoryChannel switch to always using offer, or at least have an optional 'failFast' setting to enable that behavior?
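For anyone not familiar with the difference, here is a small standalone sketch using java.util.concurrent directly (not the Flume channel code). OfferVsPut and tryEnqueue are hypothetical names. On a full queue, offer returns false immediately, whereas put would block the calling thread until a consumer frees a slot.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class OfferVsPut {

    /** Fail-fast enqueue: returns false right away if the queue is full. */
    public static boolean tryEnqueue(BlockingQueue<String> queue, String event) {
        return queue.offer(event);
    }

    /** Bounded-wait enqueue: gives up after timeoutMs if no slot frees up. */
    public static boolean tryEnqueue(BlockingQueue<String> queue, String event, long timeoutMs)
            throws InterruptedException {
        return queue.offer(event, timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);

        System.out.println(tryEnqueue(queue, "first"));      // queue has room
        System.out.println(tryEnqueue(queue, "second"));     // queue is full, event is dropped
        System.out.println(tryEnqueue(queue, "third", 50));  // waits up to 50 ms, then gives up

        // queue.put("fourth") at this point would block until something is taken
        // from the queue, which is the bottleneck behavior described above.
    }
}
```

The timed offer is a middle ground: it tolerates short stalls in the sink without blocking the source indefinitely.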

Is there another way I can accomplish truly optional channels?  I do find it encouraging that it takes this much effort to make Flume drop events :)