NodeStatusUpdaterImpl is stopped whenever a YARN job is run
Hi All,

I have a fully distributed hadoop-2.0.0-alpha (CDH4) cluster. Each node in
the cluster has 3.2 GB of RAM and runs CentOS 6.0. Since memory on each node
is limited, I modified yarn-site.xml and mapred-site.xml to run with less
memory. Here is the mapred-site.xml: http://pastebin.com/Fxjie6kg and the
yarn-site.xml: http://pastebin.com/TCJuDAhe.
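
To give a sense of the kind of changes (the values below are illustrative
sketches, not my exact pastebin contents):

yarn-site.xml:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value> <!-- total RAM the NodeManager may hand out to containers -->
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>256</value> <!-- smallest container the scheduler will grant -->
</property>

mapred-site.xml:

<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>512</value> <!-- memory requested for the MapReduce ApplicationMaster -->
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>512</value> <!-- memory requested per map task container -->
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>512</value> <!-- memory requested per reduce task container -->
</property>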
When I run a job, I get the following messages in the log file of a
NodeManager: http://pastebin.com/d8tsBA2a

On the console I get the following error:
[root@ihub-nn-a1 ~]# hadoop --config /etc/hadoop/conf/ jar /usr/lib/hadoop-mapreduce/hadoop-*-examples.jar pi 10 100000
Number of Maps  = 10
Samples per Map = 100000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Starting Job
12/07/31 00:15:12 INFO input.FileInputFormat: Total input paths to process : 10
12/07/31 00:15:12 INFO mapreduce.JobSubmitter: number of splits:10
12/07/31 00:15:12 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
12/07/31 00:15:12 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
12/07/31 00:15:12 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
12/07/31 00:15:12 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
12/07/31 00:15:12 WARN conf.Configuration: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
12/07/31 00:15:12 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
12/07/31 00:15:12 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
12/07/31 00:15:12 WARN conf.Configuration: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
12/07/31 00:15:12 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
12/07/31 00:15:12 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
12/07/31 00:15:12 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
12/07/31 00:15:12 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
12/07/31 00:15:12 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
12/07/31 00:15:12 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
12/07/31 00:15:12 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
12/07/31 00:15:12 INFO mapred.ResourceMgrDelegate: Submitted application application_1343717845091_0003 to ResourceManager at ihub-an-l1/172.31.192.151:8040
12/07/31 00:15:12 INFO mapreduce.Job: The url to track the job: http://ihub-an-l1:9999/proxy/application_1343717845091_0003/
12/07/31 00:15:12 INFO mapreduce.Job: Running job: job_1343717845091_0003
12/07/31 00:15:18 INFO mapreduce.Job: Job job_1343717845091_0003 running in uber mode : false
12/07/31 00:15:18 INFO mapreduce.Job:  map 0% reduce 0%
12/07/31 00:15:18 INFO mapreduce.Job: Job job_1343717845091_0003 failed with state FAILED due to: Application application_1343717845091_0003 failed 1 times due to AM Container for appattempt_1343717845091_0003_000001 exited with exitCode: 1 due to:
.Failing this attempt.. Failing the application.
12/07/31 00:15:18 INFO mapreduce.Job: Counters: 0
Job Finished in 6.898 seconds
java.io.FileNotFoundException: File does not exist: hdfs://ihubcluster/user/root/QuasiMonteCarlo_TMP_3_141592654/out/reduce-out
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1685)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1709)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:351)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:360)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

(The FileNotFoundException above looks like a side effect rather than the
root cause: the AM container exited with code 1 before the job could write
its reduce output.)

Does anyone have any idea why NodeStatusUpdaterImpl is stopped whenever a
task is started by the NodeManager?
Does anyone know what "exit_status: -1000" means in the NodeManager logs?
Is there a way to figure out the problem from that exit code?
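
For what it's worth, the only extra detail I have found so far is in the
per-container logs on the NodeManager host, roughly like this (the log
directory is whatever yarn.nodemanager.log-dirs points at;
/var/log/hadoop-yarn/containers is the CDH4 packaging default, so adjust
for your install):

# Inspect the failed AM attempt's logs on the NodeManager that ran it.
# The directory below is the CDH4 default for yarn.nodemanager.log-dirs;
# substitute your own configured value.
ls /var/log/hadoop-yarn/containers/application_1343717845091_0003/
cat /var/log/hadoop-yarn/containers/application_1343717845091_0003/container_*/stderr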

I strongly suspect there is a major bug in YARN when it is run with low
memory. I already identified one a couple of days ago:
http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/33110

Thanks & Regards,
Anil Gupta