RE: Hadoop hardware failure recovery
Jeffrey Buell 2012-08-14, 00:15
This is never an issue on vSphere. The ESXi hypervisor does not send the completion interrupt back to the guest until the IO is finished, so if the guest OS thinks an IO is flushed to disk, it really is flushed to disk. hsync() will work in an ESXi VM exactly as it does in a native OS.
The physical storage layer might lie about completion (e.g., most SANs with redundant battery-backed caches), but this applies equally to native and virtualized OSes.
It is always tempting to implement some kind of write caching in the virtualization layer to try to improve storage performance, but of course this comes at the cost of safety and predictability.
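To make the "flushed to disk" guarantee concrete, here is a minimal sketch of the write-then-sync pattern using only the Java standard library. It is an illustration under assumptions, not HDFS code: the class name DurableWrite and the helper writeDurably are hypothetical, and FileChannel.force(true) on a local file stands in for what hsync() is described as doing on an HDFS stream. It only shows the point at which the application asks the OS to push data to the device; whether the hypervisor or the storage layer honours that request is exactly the question discussed above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DurableWrite {

    // Hypothetical helper: writes the bytes, then blocks until the OS reports
    // that data and metadata have reached the storage device (fsync-like).
    static void writeDurably(Path path, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf); // may only land in the OS page cache
            }
            // Ask the OS to flush content and metadata to the device.
            // The guarantee is only as strong as the layers underneath.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("durable", ".log");
        writeDurably(p, "committed record\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("synced " + Files.size(p) + " bytes to " + p);
    }
}
```

The sync call returns only once the OS believes the write is complete, so a virtualization or storage layer that acknowledges early weakens the guarantee without the application being able to tell.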
From: Steve Loughran [mailto:[EMAIL PROTECTED]]
Sent: Monday, August 13, 2012 8:08 AM
To: [EMAIL PROTECTED]
Subject: Re: Hadoop hardware failure recovery
On 13 August 2012 07:55, Harsh J <[EMAIL PROTECTED]> wrote:
Note that with the 2.1.0 (upcoming) and later releases of HDFS, we offer a
working hsync() API that lets you write files with a guarantee that
the data has been written to the disk (like the fsync() *nix call).
A guarantee that the OS thinks it's been written to HDD.
For anyone using Hadoop or any other program (e.g. MySQL) in a virtualized environment: even when the OS thinks it has flushed a virtual disk, know that you may have to set some VM params to say "when we said 'flush to disk' we meant it":