Hadoop >> mail # user >> YARN Pi example job stuck at 0%(No MR tasks are started by ResourceManager)


Re: YARN Pi example job stuck at 0%(No MR tasks are started by ResourceManager)
Can you share your yarn-site.xml contents? Have you tweaked memory
sizes in there?
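
For context on why the memory settings matter here: the NodeManager log quoted below registers with "total resource of memory: 1200", and a container request larger than any single node's capacity cannot be placed. The following is a minimal sketch of that comparison, not a confirmed diagnosis. The XML fragment is a hypothetical stand-in for the real yarn-site.xml (which is not shown in this thread), and the 1536 MB ApplicationMaster request is an assumption based on the usual default of yarn.app.mapreduce.am.resource.mb; check the actual value on the cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical yarn-site.xml fragment. The value 1200 is taken from the
# "total resource of memory: 1200" line in the quoted NodeManager log.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1200</value>
  </property>
</configuration>"""


def read_property(xml_text, name, default=None):
    """Return the value of a Hadoop-style <property> entry, or default."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return default


def am_fits_on_node(node_memory_mb, am_request_mb):
    """True if a single node has enough memory for the AM container."""
    return am_request_mb <= node_memory_mb


# Assumption: the MapReduce AM asks for 1536 MB unless
# yarn.app.mapreduce.am.resource.mb has been overridden.
ASSUMED_AM_REQUEST_MB = 1536

node_mb = int(read_property(SAMPLE_YARN_SITE,
                            "yarn.nodemanager.resource.memory-mb"))
print(node_mb, am_fits_on_node(node_mb, ASSUMED_AM_REQUEST_MB))
```

If the check comes out false for every node, the application can sit in the ACCEPTED state indefinitely, which is consistent with (though not proven by) the RM log below.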

On Fri, Jul 27, 2012 at 11:53 PM, anil gupta <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I have a Hadoop 2.0 alpha (CDH4) Hadoop/HBase cluster running on
> CentOS 6.0. The cluster has 4 admin nodes and 8 data nodes. I have the RM
> and History Server running on one machine. The RM web interface shows that 8
> nodes are connected to it. I installed this cluster with HA capability and
> I have already tested HA for the NameNodes, ZK, and HBase Master. I am running the
> pi example MapReduce job as user "root" and I have created the "/user/root"
> directory in HDFS.
>
> Last few lines of one of the NodeManager logs:
> 2012-07-26 21:58:38,745 INFO org.mortbay.log: Extract
> jar:file:/usr/lib/hadoop-yarn/hadoop-yarn-common-2.0.0-cdh4.0.0.jar!/webapps/node
> to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
> 2012-07-26 21:58:38,907 INFO org.mortbay.log: Started
> SelectChannelConnector@0.0.0.0:8042
> 2012-07-26 21:58:38,907 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app
> /node started at 8042
> 2012-07-26 21:58:38,919 INFO org.apache.hadoop.yarn.webapp.WebApps:
> Registered webapp guice modules
> 2012-07-26 21:58:38,919 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is
> started.
> 2012-07-26 21:58:38,919 INFO
> org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is
> started.
> 2012-07-26 21:58:38,922 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Connected
> to ResourceManager at ihub-an-l1/172.31.192.151:8025
> 2012-07-26 21:58:38,924 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registered
> with ResourceManager as ihub-dn-l2:53199 with total resource of memory: 1200
> 2012-07-26 21:58:38,924 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl is
> started.
> 2012-07-26 21:58:38,929 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.nodemanager.NodeManager is started.
> *2012-07-26 21:58:38,929 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl is
> stopped.*
>
> Why is NodeStatusUpdaterImpl stopped?
>
> Here are the last few lines of the RM log:
> 2012-07-27 09:38:24,644 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated
> new applicationId: 2
> 2012-07-27 09:38:25,310 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application
> with id 2 submitted by user root
> 2012-07-27 09:38:25,310 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
> IP=172.31.192.51        OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1343365114818_0002
> 2012-07-27 09:38:25,310 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1343365114818_0002 State change from NEW to SUBMITTED
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
> Registering appattempt_1343365114818_0002_000001
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1343365114818_0002_000001 State change from NEW to SUBMITTED
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler:
> Application Submission: application_1343365114818_0002 from root, currently
> active: 1
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1343365114818_0002_000001 State change from SUBMITTED to
> SCHEDULED
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1343365114818_0002 State change from SUBMITTED to ACCEPTED
>
> The Pi example job has been stuck for the last hour. Why is it not trying to start

Harsh J