MapReduce user mailing list: communication path for task assignment in hadoop 2.X


hari 2013-04-11, 06:21
Harsh J 2013-04-11, 12:56
hari 2013-04-12, 04:06
Re: communication path for task assignment in hadoop 2.X
Hi again,

On Fri, Apr 12, 2013 at 9:36 AM, hari <[EMAIL PROTECTED]> wrote:
>>
>> Yes, the heartbeats between the NodeManager and the ResourceManager
>> no longer account for container assignments. Container launches
>> are handled by a separate NodeManager-embedded service called the
>> ContainerManager [1]
>
> Thanks. While I am currently going through the ContainerManager, I happened
> to notice that there is no concept of container slots similar to the task
> slots in the previous versions. Maybe now that there is the ResourceManager,
> it is controlling the number of containers that should be launched. Is that
> the case?

Yes, each NM publishes the resources it has, and the RM keeps track of
how much is in use, etc. The fixed-slot concept is gone, replaced by
looser, resource-demand-based constraints.
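As a rough illustration of the idea (toy classes I made up for this sketch, not the actual YARN implementation), a scheduler can track a node's remaining memory and vcores instead of counting fixed slots, so containers of different sizes fit as long as resources remain:

```java
// Simplified sketch (hypothetical classes, not real YARN code): tracking
// per-node resources instead of a fixed number of task slots.
import java.util.*;

class NodeResources {
    int availableMemMb;
    int availableVcores;

    NodeResources(int memMb, int vcores) {
        availableMemMb = memMb;
        availableVcores = vcores;
    }

    // Try to reserve a container's worth of resources on this node.
    boolean tryAllocate(int memMb, int vcores) {
        if (memMb <= availableMemMb && vcores <= availableVcores) {
            availableMemMb -= memMb;
            availableVcores -= vcores;
            return true;
        }
        return false;
    }
}

public class ResourceTracker {
    public static void main(String[] args) {
        // One NM publishing 8 GB / 4 vcores; note there is no "slot" count.
        NodeResources nm = new NodeResources(8192, 4);

        System.out.println(nm.tryAllocate(2048, 1)); // true
        System.out.println(nm.tryAllocate(4096, 2)); // true
        System.out.println(nm.tryAllocate(4096, 1)); // false: only 2 GB left
    }
}
```

The point is that a 2 GB container and a 4 GB container can coexist on the same node, which the old fixed-slot model could not express.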

>> > 2. Is there a different communication path for task assignment?
>> > Is the scheduler making the remote calls or are there other classes
>> > outside of YARN responsible for making the remote calls?
>>
>> The latter. The Scheduler is no longer responsible for asking NodeManagers
>> to launch the containers. An ApplicationMaster asks the Scheduler to
>> reserve it some containers with specified resource (memory/CPU/etc.)
>> allocations on available or preferred NodeManagers, and then once the
>> ApplicationMaster gets a response back that the allocation succeeded,
>> it communicates directly with the ContainerManager on the allocated
>> NodeManager to launch the command needed to run the 'task'.
>>
>> A good example to read would be the DistributedShell example.
>> I've linked [3] to show where the above AppMaster -> ContainerManager
>> requesting happens in its ApplicationMaster implementation, which
>> should help clear this up for you.
>>
>
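The two-step path quoted above can be sketched with toy classes (hypothetical names and behavior, not the real YARN client API): the AM first asks the RM-side scheduler for an allocation, and only then contacts the chosen NodeManager's container manager directly to launch the task.

```java
// Toy sketch of the AM -> RM -> NM flow (invented classes, not YARN code).
import java.util.*;

class ToyScheduler {
    // RM-side view of free memory per node.
    private final Map<String, Integer> nodeFreeMemMb = new HashMap<>();

    ToyScheduler() {
        nodeFreeMemMb.put("nm-1", 4096);
        nodeFreeMemMb.put("nm-2", 1024);
    }

    // Step 1: the AM's allocation request; returns the node that got the container.
    Optional<String> allocate(int memMb) {
        for (Map.Entry<String, Integer> e : nodeFreeMemMb.entrySet()) {
            if (e.getValue() >= memMb) {
                e.setValue(e.getValue() - memMb);
                return Optional.of(e.getKey());
            }
        }
        return Optional.empty();
    }
}

class ToyContainerManager {
    // Step 2: the AM contacts the NM directly to launch the task command.
    String startContainer(String node, String command) {
        return "launched on " + node + ": " + command;
    }
}

public class AppMasterFlow {
    public static void main(String[] args) {
        ToyScheduler rmScheduler = new ToyScheduler();
        ToyContainerManager cm = new ToyContainerManager();

        // The AM asks the RM for a 2 GB container (only nm-1 can fit it)...
        Optional<String> node = rmScheduler.allocate(2048);

        // ...then launches the task itself via the NM's container manager.
        node.ifPresent(n -> System.out.println(cm.startContainer(n, "run-task.sh")));
    }
}
```

Note that the scheduler never talks to the NodeManager here; launching is entirely the AM's job, which is the key difference from the Hadoop 1.x heartbeat-driven model.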
> Thanks for the pointers. So, for MapReduce applications, is the MRAppMaster
> class the ApplicationMaster? MRAppMaster should then be responsible for
> asking the ResourceManager for containers and then launching those on
> NodeManagers. Is that the case? Also, do all MapReduce job submissions
> (using the "hadoop jar" command) use MRAppMaster as their ApplicationMaster?

Yes to all of the above. Every MR job currently launches its own app-master.

--
Harsh J