-worker affinity and YARN scheduling
John Lilley 2013-11-11, 14:28
I would like to better understand YARN's scheduling with named workers and relaxLocality==true. For example, suppose that I have a three-node cluster with nodes A, B, and C. Each node has the capacity to run two of my tasks simultaneously. My AM then requests nine containers, with the worker name set so that I am requesting three containers per worker. The cluster starts idle and has no other users. My questions:
* Is it optimal to issue three ResourceRequests, each with numContainers==3, as opposed to nine requests with one container each?
* Initially, I expect the RM to allocate two containers per node, and I expect each container to be placed on the worker named in its request. Is this always the case?
* If the first task completes on worker "B", can I rely on the ResourceRequest for "B" to be fulfilled next?
* What techniques should be used to maximize how often containers are placed on the workers I request?
* What techniques should be used to reduce container allocation latency, if possible?
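
To make the scenario concrete, here is a toy model of what I am describing, written in plain Java. It does not call the YARN API and it is not how the RM actually schedules; the class name AffinitySketch and both helper methods are made up for illustration. It only shows (a) nine desired placements collapsed into one request per worker with a container count, and (b) the first wave I am hoping for, assuming the RM simply grants min(requested, free slots) on each named node.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: models the question's scenario, not YARN's scheduler.
public class AffinitySketch {

    // Collapse a list of desired worker names into one entry per worker,
    // where the value plays the role of numContainers for that worker.
    static Map<String, Integer> groupByNode(List<String> desiredNodes) {
        Map<String, Integer> perNode = new LinkedHashMap<>();
        for (String node : desiredNodes) {
            perNode.merge(node, 1, Integer::sum);
        }
        return perNode;
    }

    // ASSUMPTION: on an idle cluster each node grants at most its own capacity
    // and never takes work that was requested for another node.
    static Map<String, Integer> expectedFirstWave(Map<String, Integer> requested,
                                                  int capacityPerNode) {
        Map<String, Integer> granted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : requested.entrySet()) {
            granted.put(e.getKey(), Math.min(e.getValue(), capacityPerNode));
        }
        return granted;
    }

    public static void main(String[] args) {
        List<String> desired = Arrays.asList("A", "A", "A", "B", "B", "B", "C", "C", "C");
        Map<String, Integer> requests = groupByNode(desired);
        Map<String, Integer> firstWave = expectedFirstWave(requests, 2);
        System.out.println("requests   = " + requests);   // {A=3, B=3, C=3}
        System.out.println("first wave = " + firstWave);  // {A=2, B=2, C=2}
    }
}
```

Under that assumption, six containers start immediately and one request per worker stays pending. My questions above are really about how far the real scheduler departs from this picture when locality is relaxed.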