Re: HDFS read/write data throttling
https://issues.apache.org/jira/browse/HDFS-5499
On Mon, Nov 18, 2013 at 10:46 AM, Jay Vyas <[EMAIL PROTECTED]> wrote:

> Where is the jira for this?
>
> Sent from my iPhone
>
> > On Nov 18, 2013, at 1:25 PM, Andrew Wang <[EMAIL PROTECTED]> wrote:
> >
> > Thanks for asking, here's a link:
> >
> > http://www.umbrant.com/papers/socc12-cake.pdf
> >
> > I don't think there's a recording of my talk unfortunately.
> >
> > I'll also copy my comments over to the JIRA, though I'd like to not
> > distract too much from what Lohit's trying to do.
> >
> >
> > On Wed, Nov 13, 2013 at 2:54 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> >
> >> This is interesting. I've moved my comments over to the JIRA, and it
> >> would be good for yours to go there too.
> >>
> >> Is there a URL for your paper?
> >>
> >>
> >>> On 13 November 2013 06:27, Andrew Wang <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Hey Steve,
> >>>
> >>> My research project (Cake, published at SoCC '12) was trying to provide
> >>> SLAs for mixed workloads of latency-sensitive and throughput-bound
> >>> applications, e.g. HBase running alongside MR. This was challenging
> >>> because seeks are a real killer. Basically, we had to strongly limit MR
> >>> I/O to keep worst-case seek latency down, and did so by putting
> >>> schedulers on the RPC queues in HBase and HDFS to restrict queuing in
> >>> the OS and disk where we lacked preemption.
> >>>
> >>> Regarding citations of note, most academics consider throughput-sharing
> >>> to be a solved problem. It's not dissimilar from normal time slicing:
> >>> you try to ensure fairness over some coarse timescale. I think cgroups
> >>> [1] and ioprio_set [2] essentially provide this.
> >>>
> >>> Mixing throughput and latency though is difficult, and my conclusion is
> >>> that there isn't a really great solution for spinning disks besides
> >>> physical isolation. As we all know, you can get either IOPS or
> >>> bandwidth, but not both, and it's not a linear tradeoff between the
> >>> two. If you're interested in this though, I can dig up some related
> >>> work from my Cake paper.
> >>>
> >>> However, since it seems that we're more concerned with throughput-bound
> >>> apps, we might be okay just using cgroups and ioprio_set to do
> >>> time-slicing. I actually hacked up some code a while ago which passed a
> >>> client-provided priority byte to the DN, which used it to set the I/O
> >>> priority of the handling DataXceiver accordingly. This isn't the most
> >>> outlandish idea, since we've put QoS fields in our RPC protocol for
> >>> instance; this would just be another byte. Short-circuit reads are
> >>> outside this paradigm, but then you can use cgroup controls instead.
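To make the priority-byte idea concrete, here is a minimal sketch of how a single client-provided byte could be turned into the value that ioprio_set(2) expects. The class and shift constants follow the Linux ioprio encoding; the byte-to-level mapping and the function names are assumptions made purely for illustration and are not taken from the code mentioned in the message or from HDFS.

```python
# Constants mirroring the Linux ioprio encoding described in ioprio_set(2).
IOPRIO_CLASS_SHIFT = 13
IOPRIO_CLASS_BE = 2      # best-effort scheduling class
IOPRIO_BE_LEVELS = 8     # levels 0 (highest priority) to 7 (lowest)


def ioprio_value(io_class: int, level: int) -> int:
    """Pack a scheduling class and a level into one ioprio value."""
    if not 0 <= level < IOPRIO_BE_LEVELS:
        raise ValueError("level must be in 0..7")
    return (io_class << IOPRIO_CLASS_SHIFT) | level


def priority_byte_to_ioprio(priority_byte: int) -> int:
    """Map a client-provided byte to a best-effort ioprio value.

    Hypothetical mapping: 255 is the most urgent request (level 0) and
    0 is the least urgent (level 7). Only the best-effort class is used,
    on the assumption that the serving process is unprivileged.
    """
    if not 0 <= priority_byte <= 255:
        raise ValueError("priority must fit in one byte")
    bucket = priority_byte * IOPRIO_BE_LEVELS // 256   # 0..7
    return ioprio_value(IOPRIO_CLASS_BE, IOPRIO_BE_LEVELS - 1 - bucket)


if __name__ == "__main__":
    for b in (0, 128, 255):
        print(b, hex(priority_byte_to_ioprio(b)))
```

The resulting integer would then be handed to the ioprio_set system call for the thread serving the request; that call is left out here because Python has no standard-library wrapper for it and the syscall number differs between architectures.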
> >>>
> >>> My casual conversations with Googlers indicate that there isn't any
> >>> special Borg/Omega sauce either, just that they heavily prioritize DFS
> >>> I/O over non-DFS. Maybe that's another approach: if we can separate
> >>> block management in HDFS, MR tasks could just write their output to a
> >>> raw HDFS block, thus bringing a lot of I/O back into the fold of
> >>> "datanode as I/O manager" for a machine.
> >>>
> >>> Overall, I strongly agree with you that it's important to first define
> >>> what our goals are regarding I/O QoS. The general case is a tarpit, so
> >>> it'd be good to carve off useful things that can be done now (like
> >>> Lohit's direction of per-stream/FS throughput throttling with trusted
> >>> clients) and then carefully grow the scope as we find more use cases we
> >>> can confidently solve.
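As an illustration of what per-stream throughput throttling with a trusted client might look like, the sketch below uses a token bucket. It is not the design proposed in HDFS-5499; the class name, parameters, and the injectable clock and sleep functions are all assumptions chosen to keep the example self-contained and testable.

```python
import time


class StreamThrottle:
    """Token-bucket limiter for one stream (illustrative sketch only)."""

    def __init__(self, bytes_per_sec, burst_bytes=None,
                 clock=time.monotonic, sleep=time.sleep):
        if bytes_per_sec <= 0:
            raise ValueError("bytes_per_sec must be positive")
        self.rate = float(bytes_per_sec)
        # Burst capacity defaults to one second's worth of traffic.
        self.burst = float(burst_bytes if burst_bytes is not None
                           else bytes_per_sec)
        self.tokens = self.burst
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        # Credit tokens for the time elapsed, capped at the burst size.
        now = self.clock()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self, nbytes):
        """Account for nbytes; sleep off any deficit. Returns seconds waited."""
        self._refill()
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        wait = -self.tokens / self.rate
        self.sleep(wait)
        return wait


if __name__ == "__main__":
    throttle = StreamThrottle(bytes_per_sec=1_000_000)
    for _ in range(3):
        # A cooperating writer calls acquire() before sending each chunk.
        print("waited", throttle.acquire(512 * 1024), "seconds")
```

The client is "trusted" in the sense that it calls acquire() voluntarily before each read or write; nothing here stops a misbehaving client from skipping the call, which is why enforcement on the datanode side is a separate question.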
> >>>
> >>> Best,
> >>> Andrew
> >>>
> >>> [1] cgroups blkio controller
> >>> https://www.kernel.org/doc/Documentation/cgroups/blkio-controller.txt
> >>> [2] ioprio_set http://man7.org/linux/man-pages/man2/ioprio_set.2.html
> >>>
> >>>
> >>> On Tue, Nov 12, 2013 at 1:38 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> I've looked at it a bit within the context of YARN.