I am going through the concepts of the resource manager, application master, and
node manager. As I understand it, the resource manager receives the job submission
and launches the application master. It also launches a node manager to monitor
the application master. My questions are:
1. Is the node manager long-lived, and does one node manager monitor all the
containers launched on the data nodes?
2. How is resource negotiation done between the application master and the
resource manager? In other words, what happens during this step? Does the
resource manager look at the active and pending tasks, and the resources
consumed by them, before giving containers to the application master?
3. In the old MapReduce cluster, task trackers send periodic heartbeats to
the job tracker node. How does this compare to YARN? It looks like the
application master plays the role of a task tracker? I am a little confused.
4. It looks like the client polls the application master to get the progress
of the job, but initially the client connects to the resource manager. How does
the client get a reference to the application master? Does it mean that the
client gets, from the resource manager, the IP/port of the node where the
resource manager launched the application master?