Flume >> mail # user >> Guarantees of the memory channel for delivering to sink


Re: Guarantees of the memory channel for delivering to sink
Rahul,

> If we choose to use file channel with this source, we will result in double
> writes to disk, correct? (one for the legacy log files which will be
> ingested by the Spool Directory source, and the other for the WAL)
>
>
Yes, going with the file channel will lead to double disk writes. For
your use case, I think you could use the memory channel instead, if
you can live with "small" data loss. Keeping the memory channel's capacity
small helps limit that loss. For this to work reasonably well, the
source would need the ability to resume (on restart) from the last event
it committed into the channel. The amount of data lost would then be limited
to your memory channel's capacity, and you would avoid the double disk I/O.
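
A rough sketch of what such a setup might look like, assuming an agent
named "agent" with components "spool", "mem" and "out" (all hypothetical
names), a made-up spool directory path, and a logger sink standing in for
whatever sink you actually use. The capacity numbers are only illustrative;
pick them based on how many events you can afford to lose:

```properties
# Hypothetical agent layout: one Spool Directory source, one memory
# channel, one sink. All names and paths here are placeholders.
agent.sources = spool
agent.channels = mem
agent.sinks = out

# Spool Directory source reading the legacy log files
agent.sources.spool.type = spooldir
agent.sources.spool.spoolDir = /var/log/legacy-app
agent.sources.spool.channels = mem

# Small memory channel: capacity is the maximum number of events held in
# memory, so it is also roughly the most that can be lost in a crash.
agent.channels.mem.type = memory
agent.channels.mem.capacity = 1000
agent.channels.mem.transactionCapacity = 100

# Placeholder sink; replace with the real destination
agent.sinks.out.type = logger
agent.sinks.out.channel = mem
```

Note that transactionCapacity has to stay at or below capacity, and a
channel that is too small will fill up and push back on the source
whenever the sink falls behind.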

I don't know if the Spool Directory source knows precisely where to resume
from after a restart (following a crash). Brock?
-roshan