HDFS >> mail # dev >> Re: data loss after cluster wide power loss


Re: data loss after cluster wide power loss
On Wed, Jul 3, 2013 at 8:12 AM, Colin McCabe <[EMAIL PROTECTED]> wrote:

> On Mon, Jul 1, 2013 at 8:48 PM, Suresh Srinivas <[EMAIL PROTECTED]> wrote:
> > Dave,
> >
> > Thanks for the detailed email. Sorry I did not read all the details you had
> > sent earlier completely (on my phone). As you said, this is not related to
> > data loss related to HBase log and hsync. I think you are right; the rename
> > operation itself might not have hit the disk. I think we should either
> > ensure metadata operation is synced on the datanode or handle it being
> > reported as blockBeingWritten. Let me spend some time to debug this issue.
>
> In theory, ext3 is journaled, so all metadata operations should be
> durable in the case of a power outage.  It is only data operations
> that should be possible to lose.  It is the same for ext4.  (Assuming
> you are not using nonstandard mount options.)
>

The ext3 journal itself may not hit the disk right away. From what I read,
unless you explicitly call sync, even metadata operations are not guaranteed
to have reached the disk.
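
To make the point concrete, here is a minimal Python sketch of a rename that is explicitly synced, assuming a POSIX system where a directory can be opened read-only and passed to fsync. The helper name durable_rename is hypothetical and this is not datanode code (the datanode is Java); it only illustrates the sequence "sync the file, rename it, then sync the parent directories".

```python
import os


def durable_rename(src, dst):
    """Rename src to dst and fsync the directories involved (POSIX only).

    Hypothetical helper for illustration; not taken from HDFS.
    """
    # Flush the file's own data first, so the renamed file is not
    # left empty or partial after a power loss.
    fd = os.open(src, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

    os.rename(src, dst)

    # The rename is a change to the directory entries, so fsync the
    # parent directory of both the old and the new path.
    parents = {
        os.path.dirname(os.path.abspath(src)),
        os.path.dirname(os.path.abspath(dst)),
    }
    for parent in parents:
        dfd = os.open(parent, os.O_RDONLY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)
```

Without the directory fsync, the rename may sit in memory until the next journal commit, which is the window being discussed in this thread.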

See - https://www.kernel.org/doc/Documentation/filesystems/ext3.txt

commit=nrsec (*) Ext3 can be told to sync all its data and metadata
every 'nrsec' seconds. The default value is 5 seconds.
This means that if you lose your power, you will lose
as much as the latest 5 seconds of work (your
filesystem will not be damaged though, thanks to the
journaling).  This default value (or any low value)
will hurt performance, but it's good for data-safety.
Setting it to 0 will have the same effect as leaving
it at the default (5 seconds).
Setting it to very large values will improve
performance.
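
As a rough aid for checking how large that window is on a given mount, the sketch below works out the effective commit interval from a comma-separated mount option string, following only the rules quoted above (default of 5 seconds, and 0 meaning the default). The function name commit_interval is hypothetical, and the option strings in the usage lines are made-up examples.

```python
def commit_interval(mount_options, default=5):
    """Return the effective journal commit interval in seconds.

    mount_options is a comma-separated option string such as
    "rw,relatime,commit=30". Hypothetical helper for illustration.
    """
    for opt in mount_options.split(","):
        key, _, value = opt.partition("=")
        if key.strip() == "commit":
            seconds = int(value)
            # Per the quoted ext3 documentation, 0 behaves like the default.
            return default if seconds == 0 else seconds
    # No commit= option present: the default applies.
    return default


print(commit_interval("rw,relatime,data=ordered"))
print(commit_interval("rw,commit=30"))
```

On Linux the option string for each mount is the fourth field of a line in /proc/mounts; whether a default commit value is listed there may vary, so a missing option is treated as the default.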