Re: Yarn job stuck with no application master being assigned
Siddhi,

On Jun 21, 2013, at 6:07 PM, Siddhi Mehta <[EMAIL PROTECTED]> wrote:

> That solved the problem. Thanks Sandy!!
>
> What is the optimal setting for yarn.scheduler.capacity.maximum-am-resource-percent in terms of node managers?
> What are the consequences of setting it to a higher value?

This means that more AMs will be active concurrently.

One thing to remember: in terms of getting *real* work done, an AM is currently pure overhead in the sense that it does not do any actual data processing. That is true of the MapReduce AM, but it really depends on how the AM is implemented; an AM *may* choose to do some actual work.

With that context: on a very small cluster, a higher value for yarn.scheduler.capacity.maximum-am-resource-percent means that too many containers may end up running AMs, and overall utilization can drop. You want to be aware of that trade-off.
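For reference, a minimal sketch of how that property might be set in capacity-scheduler.xml; the 0.5 value is just the number Sandy suggested trying earlier in the thread, not a general recommendation:

  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <!-- Maximum fraction of cluster resources that may be used to run
         application masters. 0.5 is the value Sandy suggested trying;
         the stock Capacity Scheduler default is 0.1. -->
    <value>0.5</value>
  </property>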
> Also, I noticed that by default the application master needs 1.5GB. Are there any side effects we will face if I lower that to 1GB?
I have tried AMs with as little as 200MB for small jobs. It really depends on how many tasks you want your job to manage.
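For an MR job, the AM container size is controlled by yarn.app.mapreduce.am.resource.mb (its 1536MB default is the 1.5GB mentioned above). A minimal mapred-site.xml sketch for dropping it to 1GB; the -Xmx value shown is only an assumed example and should be sized for your own jobs:

  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <!-- Container size requested for the MapReduce AM; default is 1536. -->
    <value>1024</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <!-- Keep the AM heap comfortably below the container size above. -->
    <value>-Xmx800m</value>
  </property>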

Arun

>
> Siddhi
>
>
> On Fri, Jun 21, 2013 at 4:28 PM, Sandy Ryza <[EMAIL PROTECTED]> wrote:
> Hi Siddhi,
>
> Moving this question to the CDH list.
>
> Does setting yarn.scheduler.capacity.maximum-am-resource-percent to .5 help?
>
> Have you tried using the Fair Scheduler?
>
> -Sandy
>
>
> On Fri, Jun 21, 2013 at 4:21 PM, Siddhi Mehta <[EMAIL PROTECTED]> wrote:
> Hey All,
>
> I am running a Hadoop 2.0 (CDH 4.2.1) cluster on a single node with 1 NodeManager.
>
> We have a map-only job that launches a Pig job on the cluster (similar to what Oozie does).
>
> We are seeing that the map-only job launches the Pig script, but the Pig job is stuck in the ACCEPTED state with no tracking UI assigned.
>
> I don't see any errors in the NodeManager or ResourceManager logs as such.
>
>
> On the NodeManager I see these logs:
> 2013-06-21 15:05:13,084 INFO  capacity.ParentQueue - assignedContainer queue=root usedCapacity=0.4 absoluteUsedCapacity=0.4 used=memory: 2048 cluster=memory: 5120
>
> 2013-06-21 15:05:38,898 INFO  capacity.CapacityScheduler - Application Submission: appattempt_1371850881510_0003_000001, user: smehta queue: default: capacity=1.0, absoluteCapacity=1.0, usedResources=2048MB, usedCapacity=0.4, absoluteUsedCapacity=0.4, numApps=2, numContainers=2, currently active: 2
>
> This suggests that the cluster has capacity, but still no application master is assigned to it.
> What am I missing? Any help is appreciated.
>
> I keep seeing these logs on the NodeManager:
> 2013-06-21 16:19:37,675 INFO  monitor.ContainersMonitorImpl - Memory usage of ProcessTree 12484 for container-id container_1371850881510_0002_01_000002: 157.1mb of 1.0gb physical memory used; 590.1mb of 2.1gb virtual memory used
> 2013-06-21 16:19:37,696 INFO  monitor.ContainersMonitorImpl - Memory usage of ProcessTree 12009 for container-id container_1371850881510_0002_01_000001: 181.0mb of 1.0gb physical memory used; 1.4gb of 2.1gb virtual memory used
> 2013-06-21 16:19:37,946 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: "", exit_status: -1000,
> 2013-06-21 16:19:37,946 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 2, }, state: C_RUNNING, diagnostics: "", exit_status: -1000,
> 2013-06-21 16:19:38,948 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: "", exit_status: -1000,
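
On Sandy's Fair Scheduler question above: a minimal yarn-site.xml sketch, assuming you want to switch the ResourceManager from the Capacity Scheduler shown in these logs over to the Fair Scheduler:

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <!-- Replace the Capacity Scheduler with the Fair Scheduler. -->
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>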

Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/