Hadoop, mail # user - YARN Pi example job stuck at 0%(No MR tasks are started by ResourceManager)


Re: YARN Pi example job stuck at 0%(No MR tasks are started by ResourceManager)
Harsh J 2012-07-27, 21:23
Can you share your yarn-site.xml contents? Have you tweaked memory
sizes in there?
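
One quick way to check the memory sizes is to pull the memory-related properties out of yarn-site.xml and compare the per-node total against what the MapReduce ApplicationMaster asks for. The sketch below is only an illustration: the sample file contents are hypothetical (the 1200 value is taken from the NodeManager log quoted below), and the 1536 MB ApplicationMaster request is an assumed value for yarn.app.mapreduce.am.resource.mb that should be verified against your own mapred-site.xml. Whether this is the actual cause here is not established.

```python
import xml.etree.ElementTree as ET

# Memory-related YARN properties worth checking first.
MEMORY_KEYS = (
    "yarn.nodemanager.resource.memory-mb",
    "yarn.scheduler.minimum-allocation-mb",
    "yarn.scheduler.maximum-allocation-mb",
)


def memory_settings(yarn_site_xml: str) -> dict:
    """Return the memory-related properties set in a yarn-site.xml string."""
    root = ET.fromstring(yarn_site_xml)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in MEMORY_KEYS:
            found[name] = int(value)
    return found


def am_fits(node_mb: int, am_request_mb: int) -> bool:
    """True if a single node's total memory can hold the AM container."""
    return am_request_mb <= node_mb


# Hypothetical file contents; 1200 matches the NodeManager log quoted below.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1200</value>
  </property>
</configuration>"""

settings = memory_settings(SAMPLE)
node_mb = settings["yarn.nodemanager.resource.memory-mb"]
# Assumed AM request of 1536 MB; verify the real value on your cluster.
print(node_mb, am_fits(node_mb, 1536))
```

If the check comes back False for every node, no node could host the ApplicationMaster container, which would be consistent with an application sitting in ACCEPTED without any tasks starting.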

On Fri, Jul 27, 2012 at 11:53 PM, anil gupta <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I have a Hadoop 2.0 alpha (CDH4) Hadoop/HBase cluster running on
> CentOS 6.0. The cluster has 4 admin nodes and 8 data nodes. I have the RM
> and History Server running on one machine. The RM web interface shows that 8
> nodes are connected to it. I installed this cluster with HA capability and
> I have already tested HA for the NameNodes, ZK, and HBase Master. I am running
> the pi example MapReduce job as user "root" and I have created the
> "/user/root" directory in HDFS.
>
> Last few lines of the log from one of the NodeManagers:
> 2012-07-26 21:58:38,745 INFO org.mortbay.log: Extract
> jar:file:/usr/lib/hadoop-yarn/hadoop-yarn-common-2.0.0-cdh4.0.0.jar!/webapps/node
> to /tmp/Jetty_0_0_0_0_8042_node____19tj0x/webapp
> 2012-07-26 21:58:38,907 INFO org.mortbay.log: Started
> SelectChannelConnector@0.0.0.0:8042
> 2012-07-26 21:58:38,907 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app
> /node started at 8042
> 2012-07-26 21:58:38,919 INFO org.apache.hadoop.yarn.webapp.WebApps:
> Registered webapp guice modules
> 2012-07-26 21:58:38,919 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.nodemanager.webapp.WebServer is
> started.
> 2012-07-26 21:58:38,919 INFO
> org.apache.hadoop.yarn.service.AbstractService: Service:Dispatcher is
> started.
> 2012-07-26 21:58:38,922 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Connected
> to ResourceManager at ihub-an-l1/172.31.192.151:8025
> 2012-07-26 21:58:38,924 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registered
> with ResourceManager as ihub-dn-l2:53199 with total resource of memory: 1200
> 2012-07-26 21:58:38,924 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl is
> started.
> 2012-07-26 21:58:38,929 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.nodemanager.NodeManager is started.
> *2012-07-26 21:58:38,929 INFO
> org.apache.hadoop.yarn.service.AbstractService:
> Service:org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl is
> stopped.*
>
> Why is NodeStatusUpdaterImpl stopped?
>
> Here are the last few lines of the RM log:
> 2012-07-27 09:38:24,644 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated
> new applicationId: 2
> 2012-07-27 09:38:25,310 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application
> with id 2 submitted by user root
> 2012-07-27 09:38:25,310 INFO
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root
> IP=172.31.192.51        OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1343365114818_0002
> 2012-07-27 09:38:25,310 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1343365114818_0002 State change from NEW to SUBMITTED
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
> Registering appattempt_1343365114818_0002_000001
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1343365114818_0002_000001 State change from NEW to SUBMITTED
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler:
> Application Submission: application_1343365114818_0002 from root, currently
> active: 1
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
> appattempt_1343365114818_0002_000001 State change from SUBMITTED to
> SCHEDULED
> 2012-07-27 09:38:25,311 INFO
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl:
> application_1343365114818_0002 State change from SUBMITTED to ACCEPTED
>
> The Pi example job has been stuck for the last hour. Why is it not trying to start

Harsh J