MapReduce >> mail # user >> HDFS HA IO Fencing


lei liu 2012-10-25, 08:27
Todd Lipcon 2012-10-25, 13:08
Steve Loughran 2012-10-25, 18:23
lei liu 2012-10-26, 08:59
Todd Lipcon 2012-10-26, 14:37
Steve Loughran 2012-10-26, 17:23
lei liu 2012-10-27, 15:59
Re: HDFS HA IO Fencing
If you use NFSv4, you should be able to use locks: when a machine dies or
fails to renew its lease, the other machine can take over.
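
To make the lease idea concrete, here is a minimal sketch in Java. It is not
how NFSv4 or HDFS implement leases; LeaseSketch, LeaseView and the 10-second
duration are hypothetical names and values chosen for illustration. Each
party judges expiry by its own clock, which is also where the concern raised
in the quoted replies below comes from: a holder whose clock or scheduler
lags may still believe it owns the lease after the standby has decided it
expired.

```java
import java.util.function.LongSupplier;

public class LeaseSketch {
    /** One party's view of a lease, timed by that party's own clock. */
    static final class LeaseView {
        private final LongSupplier clockMillis; // this party's local clock (assumed injectable)
        private final long durationMillis;      // lease length; value is illustrative
        private long lastRenewedMillis;

        LeaseView(LongSupplier clockMillis, long durationMillis) {
            this.clockMillis = clockMillis;
            this.durationMillis = durationMillis;
            this.lastRenewedMillis = clockMillis.getAsLong();
        }

        /** Records a renewal at the current local time. */
        void renew() {
            lastRenewedMillis = clockMillis.getAsLong();
        }

        /** True once a full lease duration has passed on the local clock. */
        boolean isExpired() {
            return clockMillis.getAsLong() - lastRenewedMillis >= durationMillis;
        }
    }

    public static void main(String[] args) {
        // Simulated clocks so the two parties can disagree about elapsed time.
        long[] holderClock = {0};
        long[] standbyClock = {0};
        LeaseView holderView = new LeaseView(() -> holderClock[0], 10_000);
        LeaseView standbyView = new LeaseView(() -> standbyClock[0], 10_000);

        // The holder stalls and only perceives 4 s; the standby has seen 12 s pass.
        holderClock[0] = 4_000;
        standbyClock[0] = 12_000;

        System.out.println("standby thinks lease expired: " + standbyView.isExpired());
        System.out.println("holder thinks lease expired:  " + holderView.isExpired());
    }
}
```

If the standby takes over at this point while the holder keeps writing, both
sides act as owner at once, which is the split-brain case that fencing is
meant to prevent.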

On Friday, October 26, 2012, Todd Lipcon wrote:

> NFS Locks typically last forever if you disconnect abruptly. So they are
> not sufficient -- your standby wouldn't be able to take over without manual
> intervention to remove the lock.
>
> If you want to build an unreliable system that might corrupt your data,
> you could set up 'shell(/bin/true)' as a second fencing method. But, it's
> really a bad idea. There are failure scenarios which could cause split
> brain if you do this, and you'd very likely lose data.
>
> -Todd
>
> On Fri, Oct 26, 2012 at 1:59 AM, lei liu <[EMAIL PROTECTED]> wrote:
>
>> We are using NFS for shared storage. Can we use the Linux nfslock service
>> to implement IO fencing?
>>
>>
>> 2012/10/26 Steve Loughran <[EMAIL PROTECTED]>
>>
>>>
>>>
>>> On 25 October 2012 14:08, Todd Lipcon <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi Liu,
>>>>
>>>> Locks are not sufficient, because there is no way to enforce a lock in
>>>> a distributed system without unbounded blocking. What you might be
>>>> referring to is a lease, but leases are still problematic unless you can
>>>> put bounds on the speed with which clocks progress on different machines,
>>>> _and_ have strict guarantees on the way each node's scheduler works. With
>>>> Linux and Java, the latter is tough.
>>>>
>>>>
>>> On any OS running in any virtual environment, including EC2, time is
>>> entirely unpredictable, just to make things worse.
>>>
>>>
>>> On a single machine you can use file locking, as the OS will know that
>>> the process is dead and close the file; other programs can attempt to open
>>> the same file with exclusive locking and, by getting the right failures,
>>> know that something else has the file, hence the other process is live.
>>> With shared NFS storage you need to mount with softlock set, precisely to
>>> stop file locks lasting until some lease has expired, because the on-host
>>> liveness probes detect failure faster and want to react to it.
>>>
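The single-machine case described above can be sketched in Java with
java.nio file locks. This is only an illustration under stated assumptions:
LivenessLock, tryAcquire and release are hypothetical names, and the lock
file path is a temp file created for the demo. FileChannel.tryLock() returns
null when another process holds an overlapping lock, and the OS drops the
lock when the holding process exits or dies, so a failed attempt suggests
the other process is still alive.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LivenessLock {
    /**
     * Tries to take an exclusive lock on lockFile without blocking.
     * Returns the lock, or null if another process already holds it.
     * Intended to be called once per process for a given file: a second
     * attempt from the same JVM throws OverlappingFileLockException.
     */
    static FileLock tryAcquire(Path lockFile) throws IOException {
        FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = channel.tryLock();
        if (lock == null) {
            // Someone else is alive and holding the lock; we hold nothing here.
            channel.close();
        }
        return lock;
    }

    /** Releases the lock and closes the channel it was acquired on. */
    static void release(FileLock lock) throws IOException {
        lock.release();
        lock.channel().close();
    }

    public static void main(String[] args) throws IOException {
        Path lockFile = Files.createTempFile("liveness", ".lock");
        FileLock lock = tryAcquire(lockFile);
        if (lock == null) {
            System.out.println("another process holds the lock, so it is live");
            return;
        }
        System.out.println("lock acquired; this process is the active one");
        release(lock);
    }
}
```

This only holds on one host with a local file system. Over NFS the lock's
lifetime depends on the server and the mount options, which is the problem
the rest of this thread is about.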
>>>
>>> -Steve
>>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
--
Thanks
-balaji

--
http://balajin.net/blog/
http://flic.kr/balajijegan