worker affinity and YARN scheduling
I would like to better understand how YARN schedules containers when requests name specific workers and relaxedLocality==true. For example, suppose I have a three-node cluster with nodes A, B, and C, and each node has capacity to run two of my tasks simultaneously. My AM requests nine containers, with worker-name set so that three containers are requested per worker. The cluster starts idle and has no other users. My questions:

* Is it optimal to issue three ResourceRequests, each with numContainers==3 (as opposed to nine requests)?

* Initially, I expect the RM to allocate two containers per node, and I expect the containers to match the named workers. Is this always the case?

* If the first task completes on worker "B", can I rely on the ResourceRequest for "B" to be fulfilled next?

* What techniques should be used to get the containers on the workers I expect most often?

* What techniques should be used to reduce container allocation latency, if possible?
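
To make the scenario concrete, the following is a minimal plain-Java sketch of the request bookkeeping described above. It does not use the Hadoop API: the class `RequestTableSketch` and the method `aggregate` are hypothetical names invented for illustration. It shows how nine per-container asks (three each for A, B, and C) collapse into three per-worker entries with a count of 3. The extra `*` entry carrying the total is an assumption, modeled on the usual description of YARN resource names (host, rack, or `*` for any); rack-level entries are left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RequestTableSketch {
    // Assumption: "*" stands for "any node", mirroring YARN's wildcard resource name.
    static final String ANY = "*";

    // Collapse one-ask-per-container into a (resource name -> container count) table.
    static Map<String, Integer> aggregate(List<String> requestedHosts) {
        Map<String, Integer> table = new LinkedHashMap<>();
        for (String host : requestedHosts) {
            table.merge(host, 1, Integer::sum); // per-worker count
            table.merge(ANY, 1, Integer::sum);  // running total across all workers
        }
        return table;
    }

    public static void main(String[] args) {
        // Nine asks: three containers for each of the workers A, B, and C.
        List<String> asks = new ArrayList<>();
        for (String host : List.of("A", "B", "C")) {
            for (int i = 0; i < 3; i++) {
                asks.add(host);
            }
        }
        // Prints one entry per worker plus the "*" total.
        System.out.println(aggregate(asks));
    }
}
```

In this model, nine single-container asks and three asks with a count of 3 produce the same table, which is the distinction the first question is about. Whether the RM treats the two forms identically is exactly what is being asked and is not settled by the sketch.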