Re: File Channel issue - recovering from BadCheckpoint exception
I am not sure how this is generally handled by Windows developers, but I'd assume there is a way to do that. I am fairly sure this is a known issue. I think the only thing we can do for now is to disable those unit tests if the build is on Windows, or have an if-else that tests the expected behavior on Windows. I don't really like having different behavior on Windows and POSIX platforms, but if the platform itself behaves in a specific way, I doubt there is anything we can do.
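
The if-else approach to the unit tests could look roughly like the sketch below. It is only an illustration: `PlatformCheck`, `isWindows` and `expectCheckpointDeleted` are hypothetical names that do not exist in Flume, and the only real API used is the standard `os.name` system property.

```java
import java.util.Locale;

// Hypothetical helper, not part of Flume: lets a test pick the expected
// behavior based on the platform it is running on.
public class PlatformCheck {

    // True when the given os.name value identifies a Windows platform.
    public static boolean isWindows(String osName) {
        return osName != null
            && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    // Assumption taken from this thread: deleting the memory-mapped
    // checkpoint file fails on Windows and succeeds elsewhere.
    public static boolean expectCheckpointDeleted(String osName) {
        return !isWindows(osName);
    }

    public static void main(String[] args) {
        String os = System.getProperty("os.name");
        if (expectCheckpointDeleted(os)) {
            System.out.println(os + ": assert that the checkpoint was deleted");
        } else {
            System.out.println(os + ": assert that the checkpoint is still present");
        }
    }
}
```

A test that should simply be skipped on Windows could instead feed the same check to JUnit's `Assume.assumeTrue`, assuming the build uses JUnit 4.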

In case of the dual checkpoints, we might be ok - because we actually don't open the files. We just create them and then copy the content and then close them.
Cheers,
Hari
On Friday, May 31, 2013 at 4:01 PM, Roshan Naik wrote:

> I am concerned that several unit tests might depend on the auto-deletion.
>
>
> On Fri, May 31, 2013 at 3:57 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:
>
>
> > Roshan,
> >
> > No, that would break all config files from Flume 1.3.0 and Flume 1.3.1. We
> > should probably have some code that specifically disables this on Windows
> > and clearly document that.
> >
> >
> > Cheers,
> > Hari
> >
> >
> > On Friday, May 31, 2013 at 3:51 PM, Roshan Naik wrote:
> >
> > > Would it make sense for default config setting for the auto-deletion to be
> > > set to 'false' then ?
> > >
> > >
> > > On Fri, May 31, 2013 at 3:16 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:
> > >
> > >
> > >
> > > > For now, how about making the auto-deletion configurable? If it is
> > > > configured not to delete, then don't even try to start up the channel.
> > > > This will bring in the pre-1.3.0 behavior where the channel's recovery
> > > > is manual. I suspect you are going to hit many more issues when you
> > > > enable dual checkpoints - and fixing that is going to be non-trivial.
> > > >
> > > > Cheers,
> > > > Hari
> > > >
> > > >
> > > > On Friday, May 31, 2013 at 2:53 PM, Roshan Naik wrote:
> > > >
> > > > > In the EventQueueBackingStoreFileV3 constructor, if it detects that the
> > > > > checkpoint and meta files have differing logWriteOrderIds, it throws a
> > > > > BadCheckpointException. Control goes back to the exception handler in
> > > > > Log.replay(), which attempts to delete all the files in the checkpoint
> > > > > directory and start fresh. The same file names are reused when
> > > > > starting fresh.
> > > > > >
> > > > > Unfortunately this does not work on Windows, since the deletion of
> > > > > the checkpoint file in the checkpointDir fails. The failure is due to
> > > > > the fact that the checkpoint file is memory mapped. Unless it is
> > > > > unmapped the deletion will not succeed... and unfortunately Java does
> > > > > not have unmap support. Windows does not permit deletion (or renaming)
> > > > > of files in use.
> > > > > >
> > > > > The obvious thought I am having is that when starting fresh we delete
> > > > > whatever we can and invent a new file name for the ones we can't (I
> > > > > think for the checkpoint file only).
> > > > >
> > > > > thoughts ?
> > > > >
> > > > > -roshan
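
The "delete whatever we can and invent a new file name" idea from the original message could be sketched as follows. This is a minimal illustration, not Flume code: the class `FreshCheckpoint`, both method names, and the numeric-suffix naming scheme are all assumptions made for the example. It does not memory-map anything itself; it only shows how a failed deletion could be tolerated and worked around.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of Flume.
public class FreshCheckpoint {

    // Tries to delete every entry in dir and returns the entries that
    // could not be deleted (for example a file that is still memory
    // mapped on Windows).
    public static List<Path> deleteWhatWeCan(Path dir) throws IOException {
        List<Path> survivors = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                try {
                    Files.delete(entry);
                } catch (IOException e) {
                    survivors.add(entry); // keep going, report it to the caller
                }
            }
        }
        return survivors;
    }

    // Returns baseName if it is free in dir, otherwise the first of
    // baseName.1, baseName.2, ... that does not exist yet.
    // The suffix scheme is an assumption made for this example.
    public static Path nextFreeName(Path dir, String baseName) {
        Path candidate = dir.resolve(baseName);
        for (int i = 1; Files.exists(candidate); i++) {
            candidate = dir.resolve(baseName + "." + i);
        }
        return candidate;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("checkpointDir");
        Files.createFile(dir.resolve("checkpoint"));
        Files.createFile(dir.resolve("checkpoint.meta"));

        List<Path> survivors = deleteWhatWeCan(dir);
        Path fresh = nextFreeName(dir, "checkpoint");
        System.out.println("could not delete: " + survivors);
        System.out.println("new checkpoint file: " + fresh.getFileName());
    }
}
```

With this approach the channel would also need to record which checkpoint file name is current, since the name is no longer fixed; how that would be done is left open here.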