MapReduce >> mail # user >> communication path for task assignment in hadoop 2.X


Re: communication path for task assignment in hadoop 2.X
Hi Hari,

My response inline.

On Thu, Apr 11, 2013 at 11:51 AM, hari <[EMAIL PROTECTED]> wrote:
> Hi list,
>
> I was looking at the node communication (in hadoop-2.0.3-alpha)
> responsible for task assignment. So
> far I saw that the resourcemanager and nodemanager
> exchange heartbeat request/response.

Very glad you're checking out YARN. Do provide feedback on what we can
improve over the YARN JIRA when you get to try it!

> I have a couple of questions on this:
>
> 1. I saw that the heartbeat response to nodemanager does
> not include new task assignment. I think in the previous
> versions (eg. 0.20.2) the task assignment was piggybacked in the
> heartbeat response to the tasktracker. In the v2.0.3, so far
> I could only see that the response can trigger nodemanager
> shutdown, reboot, app cleanup, and container cleanup. Are there
> any other actions being triggered on the nodemanager
> by the heartbeat response ?

Note: the new term for 'Task' in YARN is 'Container'.

Yes, the heartbeats between the NodeManager and the ResourceManager no
longer carry container assignments. Container launches are handled by a
separate NodeManager-embedded service called the ContainerManager [1].

You can read through everything a heartbeat currently does at a
NodeManager in its NodeStatusUpdater service code at [2].
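To make the contrast concrete, here is a minimal sketch (invented class
names, not the real YARN types) of what a heartbeat response can tell a
NodeManager to do in 2.0.3-alpha: node-level actions plus cleanup lists,
with no field at all for new container assignments.

```java
import java.util.List;

// What the RM can ask of the NM in a heartbeat response: a node-level
// action (normal / shutdown / reboot) plus app and container cleanups.
enum NodeAction { NORMAL, SHUTDOWN, REBOOT }

class HeartbeatResponseModel {
    final NodeAction action;
    final List<String> appsToCleanup;        // finished applications
    final List<String> containersToCleanup;  // completed/killed containers

    HeartbeatResponseModel(NodeAction action, List<String> apps,
                           List<String> containers) {
        this.action = action;
        this.appsToCleanup = apps;
        this.containersToCleanup = containers;
    }
}

class NodeManagerModel {
    boolean running = true;
    int cleanupsTriggered = 0;

    // Mirrors NodeStatusUpdater's response handling: only node-level
    // directives are acted on here. Container launches never arrive via
    // this path; they come in through ContainerManager RPCs instead.
    void onHeartbeatResponse(HeartbeatResponseModel r) {
        switch (r.action) {
            case SHUTDOWN:
            case REBOOT:
                running = false;
                break;
            case NORMAL:
                cleanupsTriggered += r.appsToCleanup.size()
                                   + r.containersToCleanup.size();
        }
    }
}
```

This is only a model of the shape of the protocol, of course; the real
response processing is in the NodeStatusUpdater code linked at [2].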

> 2. Is there a different communication path for task assignment ?
> Is the scheduler making the remote calls or are there other classes outside
> of yarn responsible for making the remote calls ?

The latter. The Scheduler is no longer responsible for asking
NodeManagers to launch the containers. An ApplicationMaster asks the
Scheduler to reserve containers with specified resource (memory/CPU/etc.)
allocations on available or preferred NodeManagers, and once the
ApplicationMaster gets a response back that the allocation succeeded,
it communicates directly with the ContainerManager on the allocated
NodeManager to launch the command needed to run the 'task'.

A good example to read is DistributedShell. I've linked [3] to show
where the above AppMaster -> ContainerManager request happens in its
ApplicationMaster implementation, which should help clear this up for
you.
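The two-step flow above can also be sketched with stand-in classes (the
names below are invented for illustration; the real calls are the
AMRMProtocol allocate and ContainerManager startContainer RPCs shown in
the linked sources):

```java
import java.util.ArrayList;
import java.util.List;

// Step 1 result: the Scheduler grants resources on some NodeManager.
class Allocation {
    final String nodeId;   // NodeManager the Scheduler picked
    final int memoryMb;    // granted resource
    Allocation(String nodeId, int memoryMb) {
        this.nodeId = nodeId;
        this.memoryMb = memoryMb;
    }
}

// Stand-in for the Scheduler side of AMRMProtocol#allocate.
class SchedulerStub {
    Allocation allocate(int memoryMb) {
        return new Allocation("node-1:45454", memoryMb);
    }
}

// Stand-in for the NM-embedded ContainerManager#startContainer.
class ContainerManagerStub {
    final List<String> launched = new ArrayList<>();
    void startContainer(String command) {
        launched.add(command);
    }
}

class AppMasterModel {
    // The AM drives both steps itself; the Scheduler never contacts the
    // NodeManager to launch anything.
    static String runTask(SchedulerStub scheduler,
                          ContainerManagerStub cm, String command) {
        Allocation alloc = scheduler.allocate(1024); // step 1: reserve
        cm.startContainer(command);                  // step 2: launch
        return alloc.nodeId;
    }
}
```

DistributedShell's ApplicationMaster does essentially this, except the
"command" is a full ContainerLaunchContext (command line, environment,
local resources) sent over the real RPC protocol.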

[1] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java#L392
[startContainer, stopContainer, getContainerStatus, etc. are all
protocol calls invokable by an ApplicationMaster]
[2] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java#L433
[Link is to the heartbeat retry loop, and then the response processing
follows right after the loop]
[3] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java#L756
[See the whole method, i.e. above from this code line point which is
just the final RPC call, and you can notice how we build the
command/environment to launch our 'task']
--
Harsh J