HBase >> mail # user >> sync on writes


Mohit Anchlia 2012-08-01, 01:09
Alex Baranau 2012-08-01, 13:16
Jerry Lam 2012-08-01, 14:10
lars hofhansl 2012-08-01, 16:29
Re: sync on writes
On Wed, Aug 1, 2012 at 9:29 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> "sync" is a fluffy term in HDFS. HDFS has hsync and hflush.
> hflush forces all current changes at a DFSClient to all replica nodes (but
> not to disk).
>
> Until HDFS-744 hsync would be identical to hflush. After HDFS-744 hsync
> can be used to force data to disk at the replicas.
>
>
> When HBase refers to "sync" the hflush semantics are meant (at least until
> HBASE-5954 is finished).
> I.e. a sync here ensures that the replica nodes have seen the changes,
> which is what you want.
>
>
> So when you say "since another copy is always there on the replica nodes",
> that is only guaranteed after an hflush (again, which HBase calls sync).
>
>
> I've also written about this here:
> http://hadoop-hbase.blogspot.com/2012/05/hbase-hdfs-and-durable-sync.html
>
> -- Lars
>
>
>
Thanks, this post is very helpful.

>
> ________________________________
>  From: Mohit Anchlia <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Tuesday, July 31, 2012 6:09 PM
> Subject: sync on writes
>
> In the HBase book it is mentioned that the default behaviour of a write is to
> call sync on each node before sending replica copies to the nodes in the
> pipeline. Is there a reason this was kept as the default? If data is
> being written to multiple nodes, then the likelihood of losing data is really
> low, since another copy is always there on the replica nodes. Is it OK to
> make this sync async, and is it advisable?
>
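
For anyone reading this thread later, the hflush/hsync distinction Lars describes can be sketched with a toy model. This is not the HDFS client API: ToySyncDemo, Replica and ToyClient are hypothetical names invented for illustration, and the "memory" and "disk" lists only stand in for a replica's buffers and its local storage. The sketch assumes the post-HDFS-744 behaviour, where hsync does more than hflush.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a write pipeline. NOT the real HDFS API; all names are hypothetical. */
public class ToySyncDemo {

    /** A stand-in for a replica node: edits it has received vs. edits it has persisted. */
    static final class Replica {
        final List<String> memory = new ArrayList<>();
        final List<String> disk = new ArrayList<>();

        void receive(List<String> edits) {
            memory.addAll(edits);
        }

        void forceToDisk() {
            disk.addAll(memory);
            memory.clear();
        }

        boolean hasSeen(String edit) {
            return memory.contains(edit) || disk.contains(edit);
        }

        boolean hasOnDisk(String edit) {
            return disk.contains(edit);
        }
    }

    /** A stand-in for the writing client, which buffers edits locally until told otherwise. */
    static final class ToyClient {
        private final List<String> pending = new ArrayList<>();
        private final List<Replica> pipeline;

        ToyClient(List<Replica> pipeline) {
            this.pipeline = pipeline;
        }

        void write(String edit) {
            pending.add(edit);
        }

        /** Modelled on hflush: push buffered edits to every replica, but not to disk. */
        void hflush() {
            for (Replica r : pipeline) {
                r.receive(pending);
            }
            pending.clear();
        }

        /** Modelled on hsync after HDFS-744: as hflush, plus force the data to disk. */
        void hsync() {
            hflush();
            for (Replica r : pipeline) {
                r.forceToDisk();
            }
        }
    }

    public static void main(String[] args) {
        List<Replica> replicas = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            replicas.add(new Replica());
        }
        ToyClient client = new ToyClient(replicas);
        Replica last = replicas.get(2);

        client.write("put-1");
        System.out.println("after write:  seen=" + last.hasSeen("put-1"));

        client.hflush();
        System.out.println("after hflush: seen=" + last.hasSeen("put-1")
                + " onDisk=" + last.hasOnDisk("put-1"));

        client.hsync();
        System.out.println("after hsync:  onDisk=" + last.hasOnDisk("put-1"));
    }
}
```

In this model the other copies exist only once hflush has run, which is the point made above: the replicas protect a write only after the sync (in HBase's sense) has happened.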