Re: Usage of use-fast-replay for FileChannel
Did you have an issue with the checkpoint that caused the entire 6 GB of
data to be replayed? Look for BadCheckpointException in the logs to figure
out whether the channel was stopped in the middle of a checkpoint.
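
A minimal sketch of that log check, in Python. The function name and the sample lines are made up for illustration (they are not real Flume log output); only the string BadCheckpointException comes from this thread.

```python
from typing import Iterable, List, Tuple


def find_marker(lines: Iterable[str],
                marker: str = "BadCheckpointException") -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that contains the marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if marker in line
    ]


# Synthetic placeholder lines, NOT real Flume log output.
sample_log = [
    "placeholder: channel starting\n",
    "placeholder: BadCheckpointException seen during replay\n",
    "placeholder: replay finished\n",
]

hits = find_marker(sample_log)
for number, text in hits:
    print(f"{number}: {text}")
```

To scan a real agent log, pass an open file handle instead of the list; where that log lives depends on your installation.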

With the next version of Flume, you should be able to recover even if the
channel stopped while the checkpoint was being written.

Fast replay will try to maintain order, but it will require a massive
amount of memory to run if you have a large number of events. Also, fast
replay will only run if the checkpoint is corrupt or does not exist.
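
For reference, a sketch of how the setting would look in the agent's properties file. The agent name (agent1), the channel name (fc1), and both paths are placeholders, not taken from your setup.

```properties
# Placeholder agent and channel names; substitute your own.
agent1.channels = fc1
agent1.channels.fc1.type = file
# Placeholder paths.
agent1.channels.fc1.checkpointDir = /path/to/flume/checkpoint
agent1.channels.fc1.dataDirs = /path/to/flume/data
# Defaults to false. As noted above, it only comes into play when the
# checkpoint is corrupt or missing, and it needs a lot of memory.
agent1.channels.fc1.use-fast-replay = true
```

If you turn it on, check that the agent's JVM heap is large enough for the number of events you expect to have in the channel.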

On Mon, May 6, 2013 at 9:40 PM, Rahul Ravindran <[EMAIL PROTECTED]> wrote:

> Hi,
>    For FileChannel, how much of a performance improvement in replay times
> were observed with use-fast-replay? We currently have use-fast-replay set
> to false and were replaying about 6 G of data. We noticed replay times of
> about one hour. I looked at the code and it appears that fast-replay does
> not guarantee the same ordering of events during replay. Is this accurate?
> Are there any other downsides of using fast-replay? Any stability concerns?
> Thanks,
> ~Rahul.