Re: communication path for task assignment in hadoop 2.X
hari 2013-04-12, 04:06
> Note: New term instead of 'Task', in YARN, is 'Container'.
Thanks. The YARN guide is now easier to read.
> Yes, the heartbeats between the NodeManager and the ResourceManager
> do not account for container assignments anymore. Container launches
> are handled by a separate NodeManager-embedded service called the
> ContainerManager.
Thanks. While going through the ContainerManager, I noticed that there is
no concept of container slots similar to the task slots in previous
versions. Maybe now that there is a ResourceManager, it controls the
number of containers that should be launched. Is that the case ?
> > 2. Is there a different communication path for task assignment ?
> > Is the scheduler making the remote calls or are there other classes
> > of yarn responsible for making the remote calls ?
> The latter. Scheduler no longer is responsible for asking NodeManagers
> to launch the containers. An ApplicationMaster just asks Scheduler to
> reserve it some containers with specified resource (Memory/CPU/etc.)
> allocations on available or preferred NodeManagers, and then once the
> ApplicationMaster gets a response back that the allocation succeeded,
> it manually communicates with ContainerManager on the allocated
> NodeManager, to launch the command needed to run the 'task'.
> A good-enough example to read would be the DistributedShell example.
> I've linked  to show where the above AppMaster -> ContainerManager
> requesting happens in its ApplicationMaster implementation, which
> should help clear this for you.
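
To check my understanding of the flow described above, here is a toy model
in plain Python. It is not the YARN API: the class and method names
(Scheduler.allocate, NodeManager.start_container, run_application_master)
and the memory figures are made up for illustration. The scheduler only
grants containers against free memory, and the ApplicationMaster itself
contacts the node to launch each one.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: int
    node: "NodeManager"
    memory_mb: int


@dataclass
class NodeManager:
    name: str
    free_memory_mb: int
    launched: list = field(default_factory=list)

    def start_container(self, container, command):
        # Stand-in for the ContainerManager service embedded in the node.
        self.launched.append((container.container_id, command))


class Scheduler:
    """Hypothetical scheduler: grants containers but never launches them."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.next_id = 1

    def allocate(self, count, memory_mb):
        granted = []
        for node in self.nodes:
            # Grant against free memory on the node, not against fixed slots.
            while len(granted) < count and node.free_memory_mb >= memory_mb:
                node.free_memory_mb -= memory_mb
                granted.append(Container(self.next_id, node, memory_mb))
                self.next_id += 1
        return granted


def run_application_master(scheduler, commands, memory_mb):
    # Step 1: ask the scheduler to reserve containers.
    containers = scheduler.allocate(len(commands), memory_mb)
    # Step 2: contact each allocated node directly to launch the command.
    for container, command in zip(containers, commands):
        container.node.start_container(container, command)
    return containers


nodes = [NodeManager("node-a", 2048), NodeManager("node-b", 1024)]
scheduler = Scheduler(nodes)
commands = ["task-0", "task-1", "task-2", "task-3"]
granted = run_application_master(scheduler, commands, memory_mb=1024)
for node in nodes:
    print(node.name, node.launched)
```

With these made-up numbers only three of the four requests fit, so the
fourth command is not launched. A real ApplicationMaster would presumably
keep asking until more resources free up.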
Thanks for the pointers. So, for mapreduce applications, is MRAppMaster the
ApplicationMaster ? MRAppMaster should then be responsible for asking the
ResourceManager for containers and then launching those on NodeManagers.
Is that the case ? Also, do all mapreduce job submissions (using "hadoop
use MRAppMaster as their ApplicationMaster ?