HDFS >> mail # dev >> hsync is too slower than hflush


haosdent 2013-08-25, 05:11
Andrew Wang 2013-08-25, 23:07
haosdent 2013-08-26, 02:44
Andrew Wang 2013-08-26, 03:18
haosdent 2013-08-26, 03:21
lei liu 2013-08-26, 14:30
Re: hsync is too slower than hflush
It's also syncing the checksum file, so the disk head very likely has to move
between the data file and the checksum file. There is rotational latency too.
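
To see the effect on a local disk, here is a minimal sketch that mimics the pattern discussed in this thread: a small write to a data file plus a small write to a separate checksum-like file, with and without forcing both to the device. It is only a local-file analogy and not the DataNode code. The class name `SyncCostSketch`, the method `averageNanos`, and the `.data`/`.meta` file names are made up for illustration. `FileChannel.force(true)` stands in for fsync, and the unforced path is only a loose stand-in for hflush, which sends data to the DataNodes without requiring it to reach disk.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: local-file analogy only, not DataNode code.
public class SyncCostSketch {

    // Appends `rounds` chunks to a data file plus a 4-byte stand-in checksum
    // to a second file, and returns the average nanoseconds per round.
    // With durable=true both files are forced to the device every round.
    static long averageNanos(Path dir, String tag, int rounds, int chunkSize,
                             boolean durable) throws IOException {
        Path data = dir.resolve(tag + ".data");
        Path meta = dir.resolve(tag + ".meta");
        try (FileChannel dataCh = FileChannel.open(data,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileChannel metaCh = FileChannel.open(meta,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer chunk = ByteBuffer.allocate(chunkSize);
            ByteBuffer sum = ByteBuffer.allocate(4);
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                chunk.clear();
                sum.clear();
                while (chunk.hasRemaining()) dataCh.write(chunk);
                while (sum.hasRemaining()) metaCh.write(sum);
                if (durable) {
                    // Two separate syncs, one per file, as described in the thread.
                    dataCh.force(true);
                    metaCh.force(true);
                }
            }
            return (System.nanoTime() - start) / rounds;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sync-cost");
        long buffered = averageNanos(dir, "buffered", 100, 4096, false);
        long durable = averageNanos(dir, "durable", 100, 4096, true);
        System.out.printf("buffered: %d us per 4k write%n", buffered / 1000);
        System.out.printf("durable:  %d us per 4k write%n", durable / 1000);
    }
}
```

The absolute numbers depend heavily on the device, the filesystem, and any write cache, so treat the output as a relative comparison only.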
On Mon, Aug 26, 2013 at 7:30 AM, lei liu <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> The DataNode writes files sequentially, so I think the disk seek time should
> be very small. Why is the disk seek time 10ms? I think that is too long. Can
> we tune the Linux system configuration to reduce disk seek time?
>
>
> 2013/8/26 haosdent <[EMAIL PROTECTED]>
>
> > haha, thank you very much, I get it now.
> >
> > --
> > Best Regards,
> > Haosong Huang
> > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> >
> >
> > On Monday, August 26, 2013 at 11:18 AM, Andrew Wang wrote:
> >
> > > Ah, I forgot the checksum fsync, so two seeks. Even with 4k writes, 50ms
> > > still feels in the right ballpark. Best case it's ~20ms, still way slower
> > > than hflush.
> > >
> > > It's also worth asking if there's other dirty data waiting for writeback,
> > > since I believe it can also get written out on an fsync.
> > >
> > > hflush doesn't durably write to disk, so you're still in danger of losing
> > > data if there's a cluster-wide power outage. Because HDFS writes to two
> > > different racks, hflush still protects you from single-rack outages. Most
> > > people think this is good enough (I believe HBase by default runs with just
> > > hflush), but if you *really* want to be sure, pay the cost of hsync and do
> > > durable writes.
> > >
> > >
> > > On Sun, Aug 25, 2013 at 7:44 PM, haosdent <[EMAIL PROTECTED]> wrote:
> > >
> > > > In fact, I write just 4k per hsync. The DataNode writes both the checksum
> > > > file and the data file when I hsync data to it. Each of them takes
> > > > nearly 25ms, so an hsync call takes nearly 50ms. But hflush is very
> > > > fast: it spends about 1ms each on writing the checksum and the data. If
> > > > an hsync takes 50ms, what is the point of using it? Or is my test method
> > > > wrong?
> > > >
> > > > --
> > > > Best Regards,
> > > > Haosong Huang
> > > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > > >
> > > >
> > > > On Monday, August 26, 2013 at 7:07 AM, Andrew Wang wrote:
> > > >
> > > > > 50ms is believable. hsync makes each DN call fsync and wait for acks, so
> > > > > you'd expect at least a disk seek time (~10ms) with some extra time
> > > > > depending on how much unsync'd data is being written.
> > > > >
> > > > > So, just as some back of the envelope math, assuming a disk that can
> > > > > write at 100MB/s:
> > > > >
> > > > > 50ms - 10ms seek = 40ms writing time
> > > > > 100 MB/s * 40ms = 4MB
> > > > >
> > > > > If you're hsync'ing every 4MB, 50ms would be exactly what I'd expect.
> > > > >
> > > > > Best,
> > > > > Andrew
> > > > >
> > > > >
> > > > > On Sat, Aug 24, 2013 at 10:11 PM, haosdent <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Hi, all. Hadoop supports hsync, which calls the system's fsync, after
> > > > > > 2.0.2. I have tested the performance of hsync() and hflush() again and
> > > > > > again, but I found that every hsync() call takes nearly 50ms, while an
> > > > > > hflush() call takes just 2ms. In this slide
> > > > > > (http://www.slideshare.net/enissoz/hbase-and-hdfs-understanding-filesystem-usagePage18),
> > > > > > the author mentions that hsync() is 2x slower than hflush(). So, is
> > > > > > anything wrong? Thank you very much; I am looking forward to your help.
> > > > > >
> > > > > > --
> > > > > > Best Regards,
> > > > > > Haosong Huang
> > > > > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> >
> >
> >
>
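
For reference, the back-of-the-envelope estimate quoted above can be written as a tiny model: sync latency is roughly the number of seeks times the seek time, plus the time to transfer the unsynced bytes. This is only a sketch under the thread's own assumptions (~10ms per seek, 100MB/s sequential throughput, 1MB taken as 10^6 bytes); the class `HsyncEstimate` and method `estimateMs` are hypothetical names, and real disks will also show rotational latency and writeback of other dirty data that this model ignores.

```java
// Hypothetical sketch of the estimate discussed in the thread; not a benchmark.
public class HsyncEstimate {

    // Estimated milliseconds for one sync call:
    // seeks * seekMs + time to transfer `bytes` at `bytesPerSecond`.
    static double estimateMs(int seeks, double seekMs,
                             double bytes, double bytesPerSecond) {
        return seeks * seekMs + bytes * 1000.0 / bytesPerSecond;
    }

    public static void main(String[] args) {
        double throughput = 100e6; // 100 MB/s, as assumed in the thread
        double seekMs = 10.0;      // ~10 ms per seek, as assumed in the thread

        // One seek, 4 MB of unsynced data.
        System.out.printf("1 seek, 4MB:  %.1f ms%n",
                estimateMs(1, seekMs, 4e6, throughput));

        // Two seeks (data file + checksum file), 4k of unsynced data.
        System.out.printf("2 seeks, 4k:  %.1f ms%n",
                estimateMs(2, seekMs, 4096, throughput));
    }
}
```

With these inputs the first case comes out at about 50ms and the second at about 20ms, which lines up with the "50ms" and "best case ~20ms" figures in the replies. For 4k writes the seeks dominate and the transfer time is negligible.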