Re: communication path for task assignment in Hadoop 2.x
Hi again,

On Fri, Apr 12, 2013 at 9:36 AM, hari <[EMAIL PROTECTED]> wrote:
>>
>> Yes, the heartbeats between the NodeManager and the ResourceManager
>> do not account for container assignments anymore. Container launches
>> are handled by a separate NodeManager-embedded service called the
>> ContainerManager [1]
>
> Thanks. While I am currently going through the ContainerManager, I happened
> to notice that there is no concept of container slots similar to the task
> slots in the previous versions. Maybe now that there is a ResourceManager,
> it is controlling the number of containers that should be launched. Is that
> the case?

Yes, each NM publishes the resources it has, and the RM keeps track of
how much is in use, etc. The fixed-slot concept is gone, replaced by
looser, resource-demand-based constraints.
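
To make that concrete, here is a rough sketch (not from the original thread)
of where those advertised resources come from: each NM reads its capacity
from yarn-site.xml properties such as yarn.nodemanager.resource.memory-mb.
The YarnConfiguration constants used below are from later 2.x releases, so
exact names and defaults may differ in your version.

  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class NmAdvertisedResources {
    public static void main(String[] args) {
      YarnConfiguration conf = new YarnConfiguration();
      // yarn.nodemanager.resource.memory-mb: total memory this NM offers the RM
      int memoryMb = conf.getInt(YarnConfiguration.NM_PMEM_MB,
          YarnConfiguration.DEFAULT_NM_PMEM_MB);
      // yarn.nodemanager.resource.cpu-vcores: total virtual cores it offers
      int vcores = conf.getInt(YarnConfiguration.NM_VCORES,
          YarnConfiguration.DEFAULT_NM_VCORES);
      System.out.println("This NM advertises " + memoryMb + " MB and "
          + vcores + " vcores to the RM");
    }
  }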

>> > 2. Is there a different communication path for task assignment?
>> > Is the scheduler making the remote calls, or are there other classes
>> > outside of YARN responsible for making the remote calls?
>>
>> The latter. The Scheduler is no longer responsible for asking NodeManagers
>> to launch the containers. An ApplicationMaster just asks the Scheduler to
>> reserve it some containers with specified resource (memory/CPU/etc.)
>> allocations on available or preferred NodeManagers, and then, once the
>> ApplicationMaster gets a response back that the allocation succeeded,
>> it communicates directly with the ContainerManager on the allocated
>> NodeManager to launch the command needed to run the 'task'.
>>
>> A good-enough example to read would be the DistributedShell example.
>> I've linked [3] to show where the above AppMaster -> ContainerManager
>> requesting happens in its ApplicationMaster implementation, which
>> should help clear this up for you.
>>
>
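
If it helps, below is a rough, simplified sketch of that allocate-then-launch
flow as an ApplicationMaster's main loop. It uses the AMRMClient/NMClient
convenience libraries found in later 2.x releases rather than the lower-level
protocols the linked DistributedShell code calls directly, and it omits error
handling, resource localization and security tokens.

  import java.util.Collections;

  import org.apache.hadoop.yarn.api.records.Container;
  import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
  import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
  import org.apache.hadoop.yarn.api.records.Priority;
  import org.apache.hadoop.yarn.api.records.Resource;
  import org.apache.hadoop.yarn.client.api.AMRMClient;
  import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
  import org.apache.hadoop.yarn.client.api.NMClient;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;
  import org.apache.hadoop.yarn.util.Records;

  public class AllocateThenLaunch {
    public static void main(String[] args) throws Exception {
      YarnConfiguration conf = new YarnConfiguration();

      // Step 1: register with the RM and ask the Scheduler for one container,
      // described purely by its resource demand (no slots involved).
      AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
      rmClient.init(conf);
      rmClient.start();
      rmClient.registerApplicationMaster("", 0, "");

      Resource capability = Resource.newInstance(1024, 1); // 1024 MB, 1 vcore
      rmClient.addContainerRequest(
          new ContainerRequest(capability, null, null, Priority.newInstance(0)));

      // Step 2: once the RM reports the allocation, talk to the NM that hosts
      // the container (its ContainerManager service) to launch a command.
      NMClient nmClient = NMClient.createNMClient();
      nmClient.init(conf);
      nmClient.start();

      boolean launched = false;
      while (!launched) {
        for (Container c : rmClient.allocate(0.0f).getAllocatedContainers()) {
          ContainerLaunchContext ctx =
              Records.newRecord(ContainerLaunchContext.class);
          ctx.setCommands(Collections.singletonList("echo hello-from-container"));
          nmClient.startContainer(c, ctx);
          launched = true;
        }
        Thread.sleep(1000);
      }

      rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
    }
  }
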
> Thanks for the pointers. So, for MapReduce applications, is the MRAppMaster
> class the ApplicationMaster? MRAppMaster should then be responsible for
> asking the ResourceManager for containers and then launching those on
> NodeManagers. Is that the case? Also, do all MapReduce job submissions
> (using the "hadoop jar" command) use MRAppMaster as their ApplicationMaster?

Yes to all of the above. Every MR job currently launches its own app-master.
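
For completeness, a minimal client-side sketch (again not from the thread):
with mapreduce.framework.name set to "yarn", the MR job client submits via
YARNRunner, the RM starts a single container running MRAppMaster for the job,
and that AM then requests the map/reduce containers. Job setup details are
elided here.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapreduce.Job;

  public class YarnSubmissionSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      conf.set("mapreduce.framework.name", "yarn"); // submit via YARN, not local
      Job job = Job.getInstance(conf, "example-job");
      // ... set jar, mapper, reducer, and input/output paths as usual ...
      // job.waitForCompletion(true); // this spawns the per-job MRAppMaster
    }
  }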

--
Harsh J