-
Multiple resource requests for a given node (or all nodes)? Patrick Wendell 2011-12-10, 20:23
If you look at how resource requests are stored now, they use a map
keyed on the node hostname.

== AppSchedulingInfo.java ==
final Map<Priority, Map<String, ResourceRequest>> requests =
    new HashMap<Priority, Map<String, ResourceRequest>>();
=======

What happens if an application wants to request multiple container types on a given node? E.g., say I need 10 2GB containers and 10 1GB containers, and I don't care which node they are on (i.e. RMNode.ANY). I really want to store 2 resource requests under RMNode.ANY in this case... don't I?

Is the model just that an AM would ask for these in series?

- Patrick
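To make the question concrete, here is a minimal sketch of what a plain nested map does when two requests share the same location key. `Request`, the `Integer` priority, and the `"*"` value for `ANY` are simplified stand-ins invented for illustration; they are not the actual YARN classes, and the real scheduler may handle updates differently.

```java
import java.util.HashMap;
import java.util.Map;

class RequestCollisionDemo {
    // Hypothetical stand-in for ResourceRequest: just a size and a count.
    static final class Request {
        final int memoryMb;
        final int numContainers;

        Request(int memoryMb, int numContainers) {
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }
    }

    // Hypothetical stand-in for RMNode.ANY.
    static final String ANY = "*";

    // Stores a 2 GB request and then a 1 GB request under the same
    // (priority, location) key, mirroring the shape of the map above.
    static Map<Integer, Map<String, Request>> storeBoth() {
        Map<Integer, Map<String, Request>> requests =
            new HashMap<Integer, Map<String, Request>>();
        Map<String, Request> atPriority = new HashMap<String, Request>();
        requests.put(1, atPriority);

        atPriority.put(ANY, new Request(2048, 10));
        atPriority.put(ANY, new Request(1024, 10)); // replaces the 2 GB entry
        return requests;
    }

    public static void main(String[] args) {
        Map<String, Request> atPriority = storeBoth().get(1);
        System.out.println(atPriority.size() + " request(s) stored, memory = "
            + atPriority.get(ANY).memoryMb + " MB");
    }
}
```

With one value per key, only the most recent request for a given priority and location survives, which is why the two sizes cannot coexist under `ANY` at a single priority.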
-
Re: Multiple resource requests for a given node (or all nodes)? Todd Lipcon 2011-12-11, 00:21
On Sat, Dec 10, 2011 at 12:23 PM, Patrick Wendell <[EMAIL PROTECTED]> wrote:
> What happens if an application wants to request multiple container
> types on a given node. E.g. say I need 10 2GB containers and 10 1GB
> containers, and I don't care which node they are on (i.e. RMNode.ANY).
> I really want to store 2 resource requests under RMNode.ANY in this
> case... don't I?
>
> Is the model just that an AM would ask for these in series?

My hunch is that this was overlooked because the resource sizes for MR are basically set on a per-task-type level. That is, maps need X MB and reduces need Y MB. Since maps and reduces are set at different 'priorities', they haven't conflicted.

Does it seem straightforward to change it to a multimap? Guava has a nice implementation.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera
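The multimap idea can be sketched as follows. To stay self-contained, this uses a `Map<String, List<Request>>` from `java.util` rather than Guava; Guava's `ArrayListMultimap` would play the same role. `Request`, `ANY`, the `Integer` priority, and the `add`/`get` helpers are hypothetical names for illustration, not part of the scheduler.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultiRequestSketch {
    // Hypothetical stand-in for ResourceRequest.
    static final class Request {
        final int memoryMb;
        final int numContainers;

        Request(int memoryMb, int numContainers) {
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }
    }

    // Hypothetical stand-in for RMNode.ANY.
    static final String ANY = "*";

    // priority -> location -> every request made for that location
    private final Map<Integer, Map<String, List<Request>>> requests = new HashMap<>();

    // Appends a request instead of replacing whatever is already stored.
    void add(int priority, String location, Request request) {
        requests.computeIfAbsent(priority, p -> new HashMap<>())
                .computeIfAbsent(location, l -> new ArrayList<>())
                .add(request);
    }

    // Returns all requests for the key, or an empty list if there are none.
    List<Request> get(int priority, String location) {
        Map<String, List<Request>> byLocation = requests.get(priority);
        if (byLocation == null) {
            return new ArrayList<>();
        }
        List<Request> found = byLocation.get(location);
        return found == null ? new ArrayList<>() : found;
    }

    // Total memory asked for across a list of requests.
    static int totalMemoryMb(List<Request> list) {
        int total = 0;
        for (Request r : list) {
            total += r.memoryMb * r.numContainers;
        }
        return total;
    }

    public static void main(String[] args) {
        MultiRequestSketch sketch = new MultiRequestSketch();
        sketch.add(1, ANY, new Request(2048, 10));
        sketch.add(1, ANY, new Request(1024, 10));
        List<Request> any = sketch.get(1, ANY);
        System.out.println(any.size() + " requests, " + totalMemoryMb(any) + " MB total");
    }
}
```

Both sizes now coexist under `ANY` at one priority. The sketch only covers storage; anything that reads a single request per key would also have to change to iterate over the list.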
-
Re: Multiple resource requests for a given node (or all nodes)? Robert Evans 2011-12-12, 16:44
I think there may be some need for a bigger redesign in how requests are made to the scheduler, because the only real use case at the time it was designed was map/reduce. It works very well for that purpose but has missed a few other use cases. For example, something like HBase may want a specific number of nodes with no overlap on the same physical machines (yes, you can do it now, but it may take many iterations to get it right). Or, as with MPI or Storm, an application may not really care where the nodes are so long as they are all relatively close to one another in the network topology. Or, as with MPI, it cannot start any processing until all of the containers are ready (gang scheduling).

It gets even more complicated if we want to support preemption, as with the fair scheduler, which IMO is needed even more once MPI and other potentially very long-lived jobs start to coexist with shorter jobs with tight SLAs. In order to make a good decision about what to preempt, the scheduler needs to know that if it preempts a mapper, even though the mapper may have been running for much less time than some reducer in the same application, doing so is likely to slow things down more than preempting that reducer would. Or, if it preempts an MPI node, it might as well kill the entire application and start over, unless we somehow give the scheduler the ability to tell MPI that it is going to be preempted and needs to save its state. But even then the scheduler needs to know that preempting an MPI job will cause all progress on it, and all of the containers it is holding, to stop.

Even if we are not putting any of these scheduling features in now, we need to think about them when designing the interface, so that we do not limit ourselves and force drastic changes later on. I am just saying that I am not sure switching to a multimap is enough.

-- Bobby Evans
-
Re: Multiple resource requests for a given node (or all nodes)? Arun C Murthy 2011-12-12, 21:27
Use priorities to ask for different resource types.

Arun
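A minimal sketch of the priority workaround, assuming the same nested-map shape as the original declaration: each container size gets its own priority, so the two requests no longer share a key. `Request`, `ANY`, the `Integer` priorities, and the `put` helper are hypothetical stand-ins, and the priority values 1 and 2 are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class PriorityPerSizeSketch {
    // Hypothetical stand-in for ResourceRequest.
    static final class Request {
        final int memoryMb;
        final int numContainers;

        Request(int memoryMb, int numContainers) {
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }
    }

    // Hypothetical stand-in for RMNode.ANY.
    static final String ANY = "*";

    // Stores one request per (priority, location), creating the inner map on demand.
    static void put(Map<Integer, Map<String, Request>> requests,
                    int priority, String location, Request request) {
        Map<String, Request> byLocation = requests.get(priority);
        if (byLocation == null) {
            byLocation = new HashMap<String, Request>();
            requests.put(priority, byLocation);
        }
        byLocation.put(location, request);
    }

    // One priority per container size keeps the two ANY requests apart.
    static Map<Integer, Map<String, Request>> storeAtDistinctPriorities() {
        // TreeMap only so that iteration follows priority order.
        Map<Integer, Map<String, Request>> requests =
            new TreeMap<Integer, Map<String, Request>>();
        put(requests, 1, ANY, new Request(2048, 10));
        put(requests, 2, ANY, new Request(1024, 10));
        return requests;
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, Map<String, Request>> e
                : storeAtDistinctPriorities().entrySet()) {
            Request r = e.getValue().get(ANY);
            System.out.println("priority " + e.getKey() + ": "
                + r.numContainers + " x " + r.memoryMb + " MB");
        }
    }
}
```

The cost of this approach is that the priority value doubles as a tag for the container size, so it can no longer be read purely as an ordering between requests.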
-
Re: Multiple resource requests for a given node (or all nodes)? Patrick Wendell 2011-12-13, 03:50
Todd - that's a good question, and I haven't looked closely into whether simply adding a multimap is enough or if there are more deep-seated issues (at least to address this specific case). If it's the former, I'll probably just submit a patch.

Arun - that seems like a hack, but I guess it is a sufficient workaround for current applications.

I'm finishing up a bare-bones version of the Fair Scheduler right now (going to throw something up for review soon), but I haven't yet added preemption. How well this is going to work with various types of applications is unclear. In the MR case we can probably just preempt based on priorities, since they are essentially just ordering constraints right now. As Robert points out, this interface is very MR-centric right now. I'm not sure this generalizes well to other applications, depending on how they use priorities.

- Patrick
-
Re: Multiple resource requests for a given node (or all nodes)? Arun C Murthy 2011-12-13, 06:42
I'd argue that Robert is complaining that the interface *is not* MR-centric enough.

In any case, priorities are fairly generic. The MR AM uses them to get constraints to stick.

Arun
-
Re: Multiple resource requests for a given node (or all nodes)? Robert Evans 2011-12-13, 15:32
Arun,

I am saying that I don't know what the correct solution is for updating the scheduler interface. Perhaps the correct solution is no change; I have not taken the time to think about it much. What I am saying is that a number of new features are likely to go into the scheduler, and if we are going to change the interface, I want to be sure that we think about these use cases before we change it. That is all I am saying. I am not advocating for a particular interface at this point; as I said, I have not taken the time to think about it in depth.

--Bobby Evans