Flume, mail # dev - File Channel issue - recovering from BadCheckpoint exception


Re: File Channel issue - recovering from BadCheckpoint exception
Roshan Naik 2013-05-31, 23:01
I am concerned that several unit tests might depend on the auto-deletion.
On Fri, May 31, 2013 at 3:57 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:

> Roshan,
>
> No, that would break all config files from Flume 1.3.0 and Flume 1.3.1. We
> should probably have some code that specifically disables this on Windows
> and clearly document that.
>
>
> Cheers,
> Hari
>
>
> On Friday, May 31, 2013 at 3:51 PM, Roshan Naik wrote:
>
> > > Would it make sense for the default config setting for the auto-deletion
> > > to be set to 'false' then?
> >
> >
> > > On Fri, May 31, 2013 at 3:16 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:
> >
> >
> > > > For now, how about making the auto-deletion configurable? If it is
> > > > configured not to delete, then don't even try to start up the channel.
> > > > This will bring back the pre-1.3.0 behavior, where the channel's
> > > > recovery is manual. I suspect you are going to hit many more issues
> > > > when you enable dual checkpoints, and fixing that is going to be
> > > > non-trivial.
> > >
> > > Cheers,
> > > Hari
> > >
> > >
> > > On Friday, May 31, 2013 at 2:53 PM, Roshan Naik wrote:
> > >
> > > > > In the EventQueueBackingStoreFileV3 constructor, if it detects that the
> > > > > checkpoint and meta files have differing logWriteOrderIds, it throws a
> > > > > BadCheckpointException. Control goes back to the exception handler in
> > > > > Log.replay(), which attempts to delete all the files in the checkpoint
> > > > > directory and start fresh. The same file names are reused when
> > > > > starting fresh.
> > > > >
> > > > > Unfortunately, this does not work on Windows, since the deletion of
> > > > > the checkpoint file in the checkpointDir fails. The failure is due to
> > > > > the fact that the checkpoint file is memory mapped. Unless it is
> > > > > unmapped, the deletion will not succeed... and unfortunately Java does
> > > > > not have explicit unmap support. Windows does not permit deletion (or
> > > > > renaming) of files in use.
> > > > >
> > > > > The obvious thought I am having is that when starting fresh we delete
> > > > > whatever we can and invent a new file name for the ones we can't (I
> > > > > think this applies to the checkpoint file only).
> > > >
> > > > thoughts ?
> > > >
> > > > -roshan
>
>
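
For reference, the "delete whatever we can and invent a new file name for the ones we can't" idea from the original message could be sketched roughly as below. This is only an illustration and not Flume's actual code: the class CheckpointReset, both method names, and the ".N" suffix naming scheme are hypothetical. It also assumes that File.delete() simply returns false for a file that is still memory mapped on Windows.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flume.
final class CheckpointReset {

    /**
     * Tries to delete every regular file in the directory and returns
     * the files that could not be deleted (for example, a checkpoint
     * file that is still memory mapped on Windows).
     */
    static List<File> deleteWhatWeCan(File checkpointDir) {
        List<File> undeletable = new ArrayList<>();
        File[] files = checkpointDir.listFiles();
        if (files == null) {
            return undeletable; // not a directory, or an I/O error
        }
        for (File f : files) {
            // File.delete() reports failure through its return value.
            if (f.isFile() && !f.delete()) {
                undeletable.add(f);
            }
        }
        return undeletable;
    }

    /**
     * Returns the first unused name in the directory: baseName itself
     * if it is free, otherwise baseName.1, baseName.2, and so on.
     */
    static File freshFile(File checkpointDir, String baseName) {
        File candidate = new File(checkpointDir, baseName);
        for (int i = 1; candidate.exists(); i++) {
            candidate = new File(checkpointDir, baseName + "." + i);
        }
        return candidate;
    }
}
```

With something like this, the handler would call deleteWhatWeCan(checkpointDir) and then freshFile(checkpointDir, "checkpoint"). Where the old file was deleted, the original name is reused; where it could not be deleted, the channel would start fresh on the suffixed name. Any code that locates the checkpoint by a fixed name would need to learn about the suffix, which this sketch does not address.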