HBase >> mail # user >> sync on writes


Re: sync on writes
I believe you are talking about enabling the dfs.support.append feature? I
benchmarked the difference (disabled vs. enabled) previously and did not find
much difference. It would be great if someone else could confirm this.

Best Regards,

Jerry

On Wednesday, August 1, 2012, Alex Baranau wrote:

> I believe this is not the *default* but the *current* implementation of
> sync(). That is (please correct me if I'm wrong), the n-way write approach
> is not available yet.
> You might be confusing it with the fact that, by default, sync() is called
> on every edit. You can change that by using "deferred log flushing". Either
> way, sync() is going to be a pipelined write.
>
> The book explains the benefits of pipelined and n-way writes (p. 337); it
> is not just about which approach provides better durability of saved
> edits. Both provide durability. But they can take different amounts of
> time to execute and use the network differently: a pipelined write *may*
> be slower but can saturate network bandwidth better.
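
To make the latency/bandwidth trade-off above concrete, here is a rough toy
model. It is only a sketch under stated assumptions, not how HDFS or HBase
actually measure or implement writes: every node has the same outbound link
capacity, hop latency is uniform, and a write completes when the last
acknowledgement reaches the client. The names `Network`,
`pipelined_write_time` and `n_way_write_time` are hypothetical and exist only
for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    """Toy network: identical links and uniform latency (an assumption)."""
    bandwidth_bytes_per_s: float  # outbound capacity of each node's link
    hop_latency_s: float          # one-way latency between any two nodes


def pipelined_write_time(size_bytes: float, replicas: int, net: Network) -> float:
    """Client -> replica 1 -> replica 2 -> ...; acks travel back the same path.

    Each link carries a single copy of the data, so the transfer term does
    not grow with the replica count, but the latency term does.
    """
    transfer = size_bytes / net.bandwidth_bytes_per_s
    latency = 2 * replicas * net.hop_latency_s  # forward hops plus ack hops
    return transfer + latency


def n_way_write_time(size_bytes: float, replicas: int, net: Network) -> float:
    """Client sends one copy to every replica at the same time.

    Only one round trip of latency, but the client's own link has to carry
    all copies, so the transfer term grows with the replica count.
    """
    transfer = replicas * size_bytes / net.bandwidth_bytes_per_s
    latency = 2 * net.hop_latency_s  # one hop out, one ack hop back
    return transfer + latency


if __name__ == "__main__":
    net = Network(bandwidth_bytes_per_s=100e6, hop_latency_s=0.0005)
    for label, size in (("1 KB edit", 1e3), ("64 MB block", 64e6)):
        p = pipelined_write_time(size, replicas=3, net=net)
        n = n_way_write_time(size, replicas=3, net=net)
        print(f"{label}: pipelined={p:.5f}s n-way={n:.5f}s")
```

With these made-up numbers, the small edit finishes sooner with the n-way
write (latency dominates), while the large block finishes sooner with the
pipelined write (the client's link is not shared between copies). Real
clusters will differ.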
>
> Alex Baranau
> ------
> Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
> Solr
>
> On Tue, Jul 31, 2012 at 9:09 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
> > The HBase book mentions that the default behaviour of a write is to
> > call sync on each node before sending replica copies to the nodes in the
> > pipeline. Is there a reason this was kept as the default? If data is
> > being written to multiple nodes, the likelihood of losing data is really
> > low, since another copy is always there on the replica nodes. Is it OK to
> > make this sync async, and is it advisable?
> >