Flume >> mail # user >> Usage of use-fast-replay for FileChannel


Rahul Ravindran 2013-05-07, 04:40
Re: Usage of use-fast-replay for FileChannel
Did you have an issue with the checkpoint, such that the entire 6 GB of data
was replayed? (Look for BadCheckpointException in the logs to figure out
whether the channel was stopped in the middle of a checkpoint.)
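
As a rough illustration of the log check suggested above, the following Python sketch scans a log file for lines that mention BadCheckpointException. The function name and the log path are hypothetical; point it at wherever your agent actually writes its log.

```python
def find_bad_checkpoints(log_path):
    """Return (line_number, line) pairs that mention BadCheckpointException."""
    hits = []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if "BadCheckpointException" in line:
                hits.append((number, line.rstrip("\n")))
    return hits

# Example call (hypothetical path, adjust to your installation):
# for number, line in find_bad_checkpoints("flume.log"):
#     print(number, line)
```

An empty result only means the exception was not logged in that file; it does not by itself prove the checkpoint was intact.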

With the next version of Flume, you should be able to recover even if the
channel stopped while the checkpoint was being written.

Fast replay will try to maintain order, but it will require a massive
amount of memory to run if you have a large number of events. Also, fast
replay will only run if the checkpoint is corrupt or does not exist.
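
For reference, a minimal sketch of how the setting discussed here might look in an agent's properties file. The agent name (agent), channel name (c1), and directory paths are placeholders chosen for illustration, not values from this thread.

```properties
# Hypothetical agent "agent" with a file channel named "c1".
agent.channels = c1
agent.channels.c1.type = file
# Placeholder paths; use your own checkpoint and data directories.
agent.channels.c1.checkpointDir = /var/flume/checkpoint
agent.channels.c1.dataDirs = /var/flume/data
# Off by default. As described above, fast replay is only attempted when
# the checkpoint is corrupt or missing, and it can need a lot of memory
# when many events are queued.
agent.channels.c1.use-fast-replay = true
```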

Hari
On Mon, May 6, 2013 at 9:40 PM, Rahul Ravindran <[EMAIL PROTECTED]> wrote:

> Hi,
>    For FileChannel, how much of a performance improvement in replay times
> was observed with use-fast-replay? We currently have use-fast-replay set
> to false and were replaying about 6 GB of data. We noticed replay times of
> about one hour. I looked at the code and it appears that fast-replay does
> not guarantee the same ordering of events during replay. Is this accurate?
> Are there any other downsides of using fast-replay? Any stability concerns?
> Thanks,
> ~Rahul.
>
Rahul Ravindran 2013-05-07, 15:24