Re: HDFS HA IO Fencing
Steve Loughran 2012-10-25, 18:23
On 25 October 2012 14:08, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Hi Liu,
> Locks are not sufficient, because there is no way to enforce a lock in a
> distributed system without unbounded blocking. What you might be referring
> to is a lease, but leases are still problematic unless you can put bounds
> on the speed with which clocks progress on different machines, _and_ have
> strict guarantees on the way each node's scheduler works. With Linux and
> Java, the latter is tough.
On any OS running in a virtual environment, EC2 included, time is
entirely unpredictable, which makes things worse.
On a single machine you can use file locking: the OS knows when the
process is dead and closes the file. Other programs can attempt to open the
same file with exclusive locking and, by getting the right failures, know
that something else has the file, hence that the other process is live. With
shared NFS storage you need to mount with softlock set, precisely to stop
file locks lasting until some lease has expired, because the on-host
liveness probes detect failure faster and want to react to it.
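
To make the single-machine case concrete, here is a minimal sketch, assuming Linux and Python's fcntl.flock. The helper name try_exclusive_lock is made up for illustration, and this only shows the local-filesystem behaviour; what happens over NFS depends on the client and mount options.

```python
import fcntl
import os


def try_exclusive_lock(path):
    """Try to take an exclusive, non-blocking lock on `path`.

    Returns the open file descriptor if the lock was acquired, or None
    if some other open file description already holds it (that is, the
    other holder is still alive).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # "The right failure": someone else holds the lock.
        os.close(fd)
        return None
    return fd
```

The holder keeps the returned descriptor open for as long as it wants to be seen as live. Closing it, or the process dying and the kernel closing it on its behalf, releases the lock, so a contender that polls try_exclusive_lock sees the failure go away without waiting on any lease or clock.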