Re: YARN MapReduce 2 concepts
Hi Mohit,
Answers inline.
On Fri, Sep 20, 2013 at 1:33 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> I am going through the concepts of resource manager, application master
> and node manager. As I understand resource manager receives the job
> submission and launches application master. It also launches node manager
> to monitor application master. My questions are:
>
> 1. Is Node manager long lived and that one node manager monitors all the
> containers launched on the data nodes?
>

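Correct.  A NodeManager is a long-lived daemon, one per node, and it
monitors all of the containers launched on its node.  Note that the
ResourceManager does not launch NodeManagers; they are started as part of
the cluster, independently of any job.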
> 2. How is resource negotiation done between the application master and the
> resource manager? In other words what happens during this step? Does
> resource manager look at the active and pending tasks and resources
> consumed by those before giving containers to the application master?
>

The ResourceManager contains a pluggable scheduler that is responsible for
deciding which applications to give resources to when they become
available.  When a NodeManager heartbeats to the ResourceManager, the
scheduler will decide whether there are any containers it should place on
that node for an application, and will let the Application Master know
about its decision on the next AM-RM heartbeat.  Here's documentation for
the two recommended schedulers:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
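
To make the heartbeat-driven flow above concrete, here is a minimal toy
model in Python.  It is only a sketch: the class and method names are
hypothetical and are not YARN's API, memory is the only resource, and a
plain FIFO queue stands in for the real Fair or Capacity scheduler policy.
What it tries to show is the ordering: a request waits until a node
heartbeats, and the Application Master sees the allocation only on its
next heartbeat.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Toy model only: names and the FIFO policy are illustrative, not YARN's API."""
    pending: deque = field(default_factory=deque)   # (app_id, memory_mb) requests
    allocated: dict = field(default_factory=dict)   # app_id -> [(node, memory_mb)]

    def request(self, app_id, memory_mb):
        # An Application Master asks for one container of the given size.
        self.pending.append((app_id, memory_mb))

    def node_heartbeat(self, node, free_mb):
        # Scheduling is triggered by the node's heartbeat: place as many
        # pending requests (in FIFO order) as fit in the node's free memory.
        while self.pending and self.pending[0][1] <= free_mb:
            app_id, memory_mb = self.pending.popleft()
            free_mb -= memory_mb
            self.allocated.setdefault(app_id, []).append((node, memory_mb))
        return free_mb  # memory still free on the node after placement

    def am_heartbeat(self, app_id):
        # The AM learns of new containers only when it next heartbeats.
        return self.allocated.pop(app_id, [])


rm = ToyResourceManager()
rm.request("app_1", 1024)
rm.request("app_2", 2048)
before = rm.am_heartbeat("app_1")          # nothing yet: no node has heartbeated
left = rm.node_heartbeat("node_a", 2048)   # only app_1's request fits
after = rm.am_heartbeat("app_1")
print(before, left, after)                 # [] 1024 [('node_a', 1024)]
```

In this toy, app_2's request stays queued until some node heartbeats with
at least 2048 MB free.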
> 3. As it happens in old map reduce cluster that task trackers send
> periodic heartbeats to the job tracker nodes. How does this compare to
> YARN? It looks like application master is a task tracker? Little confused
> here.
>

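The analog to this is the NodeManager sending periodic heartbeats to the
ResourceManager.  The Application Master also sends periodic heartbeats,
but to the ResourceManager rather than to the NodeManagers; it learns
about allocated and completed containers from the responses, and it can
also query the NodeManagers its containers are running on for their
status.  The Application Master is closer to the per-job half of the old
JobTracker than to a TaskTracker.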
> 4. It looks like client polls application master to get the progress of
> the job but initially client connects to the resource manager. How does
> client get reference to the application master? Does it mean that client
> gets the node ip/port from resource manager where application master was
> launched by the resource manager?
>

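Correct.  The client asks the ResourceManager for the application's
report, which includes the host and RPC port that the Application Master
registered, and then connects to the Application Master directly.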
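
The lookup described in question 4 can be sketched as a toy model in
Python.  Everything here is hypothetical (ToyRM, ToyAppMaster,
poll_progress and the connect callback are made-up names, not the YARN
API); in real YARN the corresponding information is exposed on the
ApplicationReport through getHost() and getRpcPort().

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class ToyAppMaster:
    host: str
    rpc_port: int
    progress: float = 0.0


class ToyRM:
    """Toy registry of running Application Masters; not the YARN API."""

    def __init__(self) -> None:
        self._masters: Dict[str, ToyAppMaster] = {}

    def register_master(self, app_id: str, am: ToyAppMaster) -> None:
        # The AM registers its own address once it has started.
        self._masters[app_id] = am

    def application_report(self, app_id: str) -> Optional[Tuple[str, int]]:
        # The client's first stop: ask the RM where the AM lives.
        am = self._masters.get(app_id)
        return (am.host, am.rpc_port) if am else None


def poll_progress(rm: ToyRM, app_id: str,
                  connect: Callable[[str, int], ToyAppMaster]) -> Optional[float]:
    # Step 1: get the AM address from the RM.  Step 2: talk to the AM directly.
    address = rm.application_report(app_id)
    if address is None:
        return None  # AM not registered (yet)
    return connect(*address).progress


rm = ToyRM()
am = ToyAppMaster("node_b", 40000, progress=0.25)
rm.register_master("app_1", am)
endpoints = {("node_b", 40000): am}   # stand-in for a network connection
progress = poll_progress(rm, "app_1", lambda host, port: endpoints[(host, port)])
print(progress)                       # 0.25
```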