Flume, mail # dev - File Channel issue - recovering from BadCheckpoint exception


Re: File Channel issue - recovering from BadCheckpoint exception
Brock Noland 2013-06-01, 02:08
I think we could add JUnit Assume statements for any tests which depend on
this value, since it will be auto-disabled on Windows.
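
A minimal sketch of the platform check that such an assumption could rely on. The class name PlatformCheck and its methods are hypothetical and not part of Flume; only the os.name system property is standard Java.

```java
import java.util.Locale;

// Hypothetical helper, not part of Flume: detects Windows from os.name.
final class PlatformCheck {
    private PlatformCheck() {}

    /** Returns true when the given os.name value looks like a Windows platform. */
    public static boolean isWindows(String osName) {
        return osName != null
            && osName.toLowerCase(Locale.ENGLISH).startsWith("windows");
    }

    /** Checks the platform the current JVM is running on. */
    public static boolean isWindows() {
        return isWindows(System.getProperty("os.name"));
    }
}
```

In a JUnit 4 test, calling Assume.assumeTrue(!PlatformCheck.isWindows()) at the top of a test method (or in a @Before method) would make JUnit skip the test on Windows rather than fail it.
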
On Fri, May 31, 2013 at 6:15 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:

> I am not sure how this is generally handled by Windows developers, but I'd
> assume there is a way to do that. I am fairly sure this is a known issue. I
> think the only thing we can do for now is to disable those unit tests if
> the build is on Windows, or have an if-else that tests the expected behavior
> on Windows. I don't really like having different behavior on Windows and
> POSIX platforms, but if the platform itself behaves in a specific way, I
> doubt there is anything we can do.
>
> In case of the dual checkpoints, we might be ok - because we actually
> don't open the files. We just create them and then copy the content and
> then close them.
>
>
> Cheers,
> Hari
>
>
> On Friday, May 31, 2013 at 4:01 PM, Roshan Naik wrote:
>
> > I am concerned several unit tests might be dependent on the auto-deletion.
> >
> >
> > On Fri, May 31, 2013 at 3:57 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:
> >
> >
> > > Roshan,
> > >
> > > No, that would break all config files from Flume 1.3.0 and Flume 1.3.1.
> > > We should probably have some code that specifically disables this on
> > > Windows and clearly document that.
> > >
> > >
> > > Cheers,
> > > Hari
> > >
> > >
> > > On Friday, May 31, 2013 at 3:51 PM, Roshan Naik wrote:
> > >
> > > > Would it make sense for the default config setting for the auto-deletion
> > > > to be set to 'false' then?
> > > >
> > > >
> > > > On Fri, May 31, 2013 at 3:16 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:
> > > >
> > > >
> > > >
> > > > > > For now, how about making the auto-deletion configurable? If it is
> > > > > > configured not to delete, then don't even try to start up the channel.
> > > > > > This will bring in the pre-1.3.0 behavior where the channel's recovery
> > > > > > is manual? I suspect you are going to hit many more issues when you
> > > > > > enable dual checkpoints - and fixing that is going to be non-trivial.
> > > > >
> > > > > Cheers,
> > > > > Hari
> > > > >
> > > > >
> > > > > On Friday, May 31, 2013 at 2:53 PM, Roshan Naik wrote:
> > > > >
> > > > > > > In the EventQueueBackingStoreFileV3 constructor, if it detects that the
> > > > > > > checkpoint and meta files have differing logWriteOrderIds, it throws a
> > > > > > > BadCheckpointException. Control goes back to the exception handler in
> > > > > > > Log.replay(), which attempts to delete all the files in the checkpoint
> > > > > > > directory and start fresh. The same file names are reused when starting
> > > > > > > fresh.
> > > > > >
> > > > > > > Unfortunately this does not work on Windows, since the deletion of
> > > > > > > the checkpoint file in the checkpointDir fails. The failure is due to the
> > > > > > > fact that the checkpoint file is memory mapped. Unless it is unmapped the
> > > > > > > deletion will not succeed... and unfortunately Java does not have unmap
> > > > > > > support. Windows does not permit deletion (or renaming) of files in use.
> > > > > >
> > > > > > > The obvious thought I am having is that when starting fresh we delete
> > > > > > > whatever we can and invent a new file name for the ones we can't (I think
> > > > > > > for the checkpoint file only).
> > > > > >
> > > > > > thoughts ?
> > > > > >
> > > > > > -roshan
>
>
--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
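
The "delete whatever we can and invent a new file name" idea from the original message could be sketched roughly as below. This is an illustration only, not Flume's actual recovery code: the class CheckpointCleanup, the method resetCheckpointDir, and the numeric-suffix naming scheme are all assumptions made for the example.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch, not Flume code: best-effort cleanup of a checkpoint dir.
final class CheckpointCleanup {
    private CheckpointCleanup() {}

    /**
     * Tries to delete every regular file in checkpointDir, then returns the
     * file to use for the fresh checkpoint. If the old checkpoint survives
     * deletion (for example because it is still memory mapped on Windows),
     * an unused name with a numeric suffix is returned instead.
     */
    public static File resetCheckpointDir(File checkpointDir, String checkpointName)
            throws IOException {
        File[] files = checkpointDir.listFiles();
        if (files == null) {
            throw new IOException("Cannot list " + checkpointDir);
        }
        for (File f : files) {
            if (f.isFile()) {
                f.delete(); // best effort; a failed delete is handled below
            }
        }
        // Reuse the original name if it is free, otherwise invent a new one.
        File candidate = new File(checkpointDir, checkpointName);
        int suffix = 0;
        while (candidate.exists()) {
            suffix++;
            candidate = new File(checkpointDir, checkpointName + "." + suffix);
        }
        return candidate;
    }
}
```

One consequence of this design is that the checkpoint file name is no longer fixed, so whatever opens the checkpoint on the next start would need a way to find the most recent name. The sketch does not address that.
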