HDFS, mail # dev - HDFS read/write data throttling


Re: HDFS read/write data throttling
Andrew Wang 2013-11-18, 21:25
https://issues.apache.org/jira/browse/HDFS-5499
On Mon, Nov 18, 2013 at 10:46 AM, Jay Vyas <[EMAIL PROTECTED]> wrote:

> Where is the JIRA for this?
>
> Sent from my iPhone
>
> > On Nov 18, 2013, at 1:25 PM, Andrew Wang <[EMAIL PROTECTED]> wrote:
> >
> > Thanks for asking, here's a link:
> >
> > http://www.umbrant.com/papers/socc12-cake.pdf
> >
> > I don't think there's a recording of my talk, unfortunately.
> >
> > I'll also copy my comments over to the JIRA, though I'd like to not
> > distract too much from what Lohit's trying to do.
> >
> >
> > On Wed, Nov 13, 2013 at 2:54 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> >
> >> this is interesting - I've moved my comments over to the JIRA, and it
> >> would be good for yours to go there too.
> >>
> >> is there a URL for your paper?
> >>
> >>
> >>> On 13 November 2013 06:27, Andrew Wang <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Hey Steve,
> >>>
> >>> My research project (Cake, published at SoCC '12) was trying to provide
> >>> SLAs for mixed workloads of latency-sensitive and throughput-bound
> >>> applications, e.g. HBase running alongside MR. This was challenging
> >>> because seeks are a real killer. Basically, we had to strongly limit
> >>> MR I/O to keep worst-case seek latency down, and did so by putting
> >>> schedulers on the RPC queues in HBase and HDFS to restrict queuing in
> >>> the OS and disk, where we lacked preemption.
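
[To make that shape concrete, here is a minimal sketch of a two-class
scheduler of the kind Cake places in front of an RPC handler pool. All
names are hypothetical (this is not the Cake or HDFS code); the point is
that latency-sensitive calls jump the queue, and a cap on in-flight
throughput-bound work is what keeps the non-preemptible OS and disk
queues below it short.]

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.Semaphore;

    /** Hypothetical two-class scheduler in front of an RPC handler pool. */
    public class TwoClassRpcQueue<T> {
        private final BlockingQueue<T> latencySensitive = new LinkedBlockingQueue<>();
        private final BlockingQueue<T> throughputBound = new LinkedBlockingQueue<>();
        // Cap on throughput-bound calls in flight: the OS and disk queues
        // below us offer no preemption, so keep them short from above.
        private final Semaphore throughputSlots;

        public TwoClassRpcQueue(int maxThroughputInFlight) {
            throughputSlots = new Semaphore(maxThroughputInFlight);
        }

        public void enqueue(T call, boolean latencySensitiveCall) {
            (latencySensitiveCall ? latencySensitive : throughputBound).add(call);
        }

        /** Handler threads call this; latency-sensitive work always wins. */
        public T next() throws InterruptedException {
            while (true) {
                T call = latencySensitive.poll();
                if (call != null) return call;
                if (throughputSlots.tryAcquire()) {
                    call = throughputBound.poll();
                    if (call != null) return call; // slot freed in complete()
                    throughputSlots.release();
                }
                Thread.sleep(1); // sketch only: a real impl would block, not poll
            }
        }

        /** Called when a request finishes, to free its in-flight slot. */
        public void complete(boolean latencySensitiveCall) {
            if (!latencySensitiveCall) throughputSlots.release();
        }
    }
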
> >>>
> >>> Regarding citations of note, most academics consider throughput-sharing
> >>> to be a solved problem. It's not dissimilar from normal time slicing:
> >>> you try to ensure fairness over some coarse timescale. I think cgroups
> >>> [1] and ioprio_set [2] essentially provide this.
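
[For illustration, a minimal sketch of driving ioprio_set [2] from Java.
Everything here is an assumption of the sketch, not an existing Hadoop
API: it presumes JNA on the classpath, the x86_64 Linux syscall number
(251), and constants copied from linux/ioprio.h. Note that only I/O
schedulers such as CFQ (today, BFQ) honor these classes.]

    import com.sun.jna.Library;
    import com.sun.jna.Native;

    /** Hypothetical helper: ioprio_set(2) has no glibc wrapper, so use syscall(2). */
    public final class IoPrio {
        public interface CLib extends Library {
            CLib INSTANCE = Native.load("c", CLib.class);
            int syscall(int number, Object... args);
        }

        private static final int SYS_IOPRIO_SET = 251;    // __NR_ioprio_set, x86_64 only
        private static final int IOPRIO_WHO_PROCESS = 1;  // with id 0 = calling thread
        public static final int IOPRIO_CLASS_BE = 2;      // best-effort, levels 0-7
        public static final int IOPRIO_CLASS_IDLE = 3;    // disk time only when idle
        private static final int IOPRIO_CLASS_SHIFT = 13;

        /** Set the calling thread's I/O scheduling class and level. */
        public static void setThisThread(int ioClass, int level) {
            int prio = (ioClass << IOPRIO_CLASS_SHIFT) | level;
            int rc = CLib.INSTANCE.syscall(SYS_IOPRIO_SET, IOPRIO_WHO_PROCESS, 0, prio);
            if (rc != 0) {
                throw new IllegalStateException("ioprio_set failed, rc=" + rc);
            }
        }

        private IoPrio() {}
    }

[A bulk-copy thread would call IoPrio.setThisThread(IoPrio.IOPRIO_CLASS_IDLE, 0)
to get disk time only when nothing else wants it - time slicing in its
crudest form.]
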
> >>>
> >>> Mixing throughput and latency though is difficult, and my conclusion is
> >>> that there isn't a really great solution for spinning disks besides
> >>> physical isolation. As we all know, you can get either IOPS or
> >>> bandwidth, but not both, and it's not a linear tradeoff between the two.
> >>> If you're interested in this though, I can dig up some related work
> >>> from my Cake paper.
> >>>
> >>> However, since it seems that we're more concerned with throughput-bound
> >>> apps, we might be okay just using cgroups and ioprio_set to do
> >>> time-slicing. I actually hacked up some code a while ago which passed a
> >>> client-provided priority byte to the DN, which used it to set the I/O
> >>> priority of the handling DataXceiver accordingly. This isn't the most
> >>> outlandish idea, since we've put QoS fields in our RPC protocol, for
> >>> instance; this would just be another byte. Short-circuit reads are
> >>> outside this paradigm, but then you can use cgroup controls instead.
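
[The DN side of that hack could look like the following. DataXceiver is
the real per-connection handler class, but the priority byte in the
operation header and the IoPrio helper (from the sketch above) are
assumptions of this illustration, not the actual patch.]

    // Hypothetical: run on the DataXceiver thread after decoding the op header.
    // 'priority' is the client-supplied byte; 0 means leave the default alone.
    static void applyClientPriority(byte priority) {
        if (priority > 0) {
            // Map the RPC-level priority onto the best-effort class,
            // clamping to the kernel's 0 (highest) .. 7 (lowest) range.
            int level = Math.min(priority, 7);
            IoPrio.setThisThread(IoPrio.IOPRIO_CLASS_BE, level);
        }
    }
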
> >>>
> >>> My casual conversations with Googlers indicate that there isn't any
> >>> special Borg/Omega sauce either, just that they heavily prioritize DFS
> >>> I/O over non-DFS. Maybe that's another approach: if we can separate
> >>> block management in HDFS, MR tasks could just write their output to a
> >>> raw HDFS block, thus bringing a lot of I/O back into the fold of
> >>> "datanode as I/O manager" for a machine.
> >>>
> >>> Overall, I strongly agree with you that it's important to first define
> >>> what our goals are regarding I/O QoS. The general case is a tarpit, so
> >>> it'd be good to carve off useful things that can be done now (like
> >>> Lohit's direction of per-stream/FS throughput throttling with trusted
> >>> clients) and then carefully grow the scope as we find more use cases
> >>> we can confidently solve.
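
[The per-stream throttling Lohit is pursuing can be as small as a token
bucket wrapped around each transfer; a minimal sketch with hypothetical
names follows. HDFS's own DataTransferThrottler, used by the balancer,
works along these lines.]

    /** Hypothetical per-stream throttle capping average rate at bytesPerSec. */
    public class StreamThrottler {
        private static final long PERIOD_MS = 500;  // accounting window
        private final long bytesPerPeriod;
        private long windowStart = System.currentTimeMillis();
        private long bytesUsed = 0;

        public StreamThrottler(long bytesPerSec) {
            this.bytesPerPeriod = bytesPerSec * PERIOD_MS / 1000;
        }

        /** Call after each chunk; sleeps once the window's budget is spent. */
        public synchronized void throttle(long numBytes) throws InterruptedException {
            bytesUsed += numBytes;
            while (bytesUsed > bytesPerPeriod) {
                long elapsed = System.currentTimeMillis() - windowStart;
                if (elapsed < PERIOD_MS) {
                    Thread.sleep(PERIOD_MS - elapsed);     // idle out the window
                }
                windowStart = System.currentTimeMillis();  // open the next window
                bytesUsed -= bytesPerPeriod;               // carry over any overshoot
            }
        }
    }

[Note why "trusted clients" matters here: the throttle only ever sleeps
its caller, so a client that simply doesn't call it is unconstrained.]
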
> >>>
> >>> Best,
> >>> Andrew
> >>>
> >>> [1] cgroups blkio controller
> >>> https://www.kernel.org/doc/Documentation/cgroups/blkio-controller.txt
> >>> [2] ioprio_set http://man7.org/linux/man-pages/man2/ioprio_set.2.html
> >>>
> >>>
> >>> On Tue, Nov 12, 2013 at 1:38 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> I've looked at it a bit within the context of YARN.