Flume, mail # dev - File Channel issue - recovering from BadCheckpoint exception


Re: File Channel issue - recovering from BadCheckpoint exception
Roshan Naik 2013-05-31, 22:51
Would it make sense for the default config setting for auto-deletion to be
'false', then?
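
To make the option under discussion concrete, here is a minimal Java sketch of a
recovery hook guarded by an auto-delete flag. Everything in it is an assumption
made for illustration: the class name, the handleBadCheckpoint method, the
autoDelete parameter, and the simplified BadCheckpointException stand-in are
hypothetical and are not Flume's actual code or configuration keys.

```java
import java.io.File;

final class CheckpointRecoverySketch {

    /** Simplified stand-in for Flume's BadCheckpointException (assumption). */
    static class BadCheckpointException extends RuntimeException {
        BadCheckpointException(String message) {
            super(message);
        }
    }

    /**
     * Hypothetical hook called when a bad checkpoint is detected.
     * With autoDelete == false it refuses to start, leaving recovery to the
     * operator. With autoDelete == true it tries to delete every file in the
     * checkpoint directory and reports whether all deletions succeeded.
     */
    static boolean handleBadCheckpoint(File checkpointDir, boolean autoDelete,
                                       BadCheckpointException cause) {
        if (!autoDelete) {
            throw new IllegalStateException(
                "Bad checkpoint in " + checkpointDir
                    + "; auto-deletion is disabled, manual recovery required", cause);
        }
        boolean allDeleted = true;
        File[] files = checkpointDir.listFiles();
        if (files != null) {
            for (File file : files) {
                // File.delete() returns false instead of throwing on failure,
                // e.g. on Windows when the file is still memory mapped.
                if (!file.delete()) {
                    allDeleted = false;
                }
            }
        }
        return allDeleted;
    }
}
```

With the default set to false, a bad checkpoint would stop the channel from
starting rather than silently discarding the checkpoint files.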
On Fri, May 31, 2013 at 3:16 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:

> For now, how about making the auto-deletion configurable? If it is
> configured not to delete, then don't even try to start up the channel. This
> would bring back the pre-1.3.0 behavior, where the channel's recovery is
> manual. I suspect you are going to hit many more issues when you enable
> dual checkpoints - and fixing that is going to be non-trivial.
>
> Cheers,
> Hari
>
>
> On Friday, May 31, 2013 at 2:53 PM, Roshan Naik wrote:
>
> > In the EventQueueBackingStoreFileV3 constructor, if it detects that the
> > checkpoint and meta files have differing logWriteOrderIds, it throws a
> > BadCheckpointException. Control goes back to the exception handler in
> > Log.replay(), which attempts to delete all the files in the checkpoint
> > directory and start fresh. The same file names are reused when starting
> > fresh.
> >
> > Unfortunately this does not work on Windows, since the deletion of
> > the checkpoint file in the checkpointDir fails. The failure is due to the
> > fact that the checkpoint file is memory mapped. Unless it is unmapped, the
> > deletion will not succeed... and unfortunately Java does not have unmap
> > support. Windows does not permit deletion (or renaming) of files in use.
> >
> > The obvious thought I am having is that when starting fresh we delete
> > whatever we can and invent a new file name for the ones we can't (I think
> > that is the checkpoint file only).
> >
> > Thoughts?
> >
> > -roshan
>
>
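
The "delete whatever we can and invent a new file name" idea from the original
message might be sketched in Java as follows. This is only an illustration under
stated assumptions: freshCheckpointFile, the numeric-suffix naming scheme, and
the injected tryDelete predicate are hypothetical and not part of Flume. The
predicate is injected so that a failing delete (as with a memory-mapped file on
Windows) can be simulated on any platform.

```java
import java.io.File;
import java.util.function.Predicate;

final class CheckpointNameSketch {

    /**
     * Picks the file to use for a fresh checkpoint in dir.
     * The original name is kept if no stale file exists or the stale file can
     * be deleted. Otherwise a numeric suffix is appended until a name is found
     * that is either unused or whose stale file can be deleted.
     */
    static File freshCheckpointFile(File dir, String baseName,
                                    Predicate<File> tryDelete) {
        File candidate = new File(dir, baseName);
        int suffix = 0;
        while (candidate.exists() && !tryDelete.test(candidate)) {
            // The stale file is still in use; fall back to a new name.
            suffix++;
            candidate = new File(dir, baseName + "." + suffix);
        }
        return candidate;
    }
}
```

In real use the predicate would simply be File::delete, for example
freshCheckpointFile(checkpointDir, "checkpoint", File::delete). A full solution
would also need a way to find the chosen name again on the next startup and to
clean up the abandoned file once it is no longer mapped; the sketch leaves both
out.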