MapReduce >> mail # user >> Hadoop hardware failure recovery


RE: Hadoop hardware failure recovery
This is never an issue on vSphere. The ESXi hypervisor does not send the completion interrupt back to the guest until the IO is finished, so if the guest OS thinks an IO is flushed to disk, it really is flushed to disk. hsync() will work in an ESXi VM exactly as it does in a native OS.

The physical storage layer might lie about completion (e.g., most SANs with redundant battery-backed caches), but this applies equally to native and virtualized OSes.

It is always tempting to implement some kind of write caching in the virtualization layer to try to improve storage performance, but of course this comes at the cost of safety and predictability.

Jeff

From: Steve Loughran [mailto:[EMAIL PROTECTED]]
Sent: Monday, August 13, 2012 8:08 AM
To: [EMAIL PROTECTED]
Subject: Re: Hadoop hardware failure recovery
On 13 August 2012 07:55, Harsh J <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:

Note that with 2.1.0 (upcoming) and above releases of HDFS, we offer a
working hsync() API that allows you to write files with a guarantee that
the data has been written to the disk (like the fsync() *nix call).

A guarantee that the OS thinks it's been written to HDD.

For anyone using Hadoop or any other program (e.g. MySQL) in a virtualized environment: even when the OS thinks it has flushed a virtual disk, know that you may have to set some VM params to say "when we said 'flush to disk' we meant it":
https://forums.virtualbox.org/viewtopic.php?t=13661
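
For readers who want to see what the fsync()-style guarantee discussed above looks like in code, here is a minimal sketch using only the Java standard library on a local file. It is an analogy, not HDFS code: the class name DurableWrite and the method writeDurably are made up for illustration, and FileChannel.force(true) stands in for the role that hsync() on an HDFS output stream plays in the thread. As noted above, a successful return only means the OS believes the data is on disk; a hypervisor or storage layer that ignores flushes can still lose it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, not part of Hadoop.
public class DurableWrite {

    // Writes data to path and asks the OS to flush it to the storage device
    // before returning (the local-file analogue of fsync()).
    static void writeDurably(Path path, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            // write() may accept fewer bytes than requested, so loop until done.
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // force(true) requests that file content and metadata be flushed.
            // This is only as trustworthy as the layers underneath the OS.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("durable", ".log");
        writeDurably(p, "committed record\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote and forced " + Files.size(p) + " bytes to " + p);
    }
}
```

Whether the bytes survive a power cut after force() returns depends on the VM and storage settings described in the messages above, which is why the flush configuration of the virtualization layer matters.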