-
java.lang.OutOfMemoryError: Java heap space
Shuja Rehman, 2010-07-10, 17:40

Hi All,

I am facing a hard problem. I am running a MapReduce job using streaming, but it fails with the following error:

    Caught: java.lang.OutOfMemoryError: Java heap space
        at Nodemapper5.parseXML(Nodemapper5.groovy:25)

    java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

I have increased the heap size in hadoop-env.sh to 2000M. I also set it explicitly for the job with the following line:

    -D mapred.child.java.opts=-Xmx2000M \

but it still gives the error. The same job runs fine if I run it from the shell with a 1024M heap, like this:

    cat file.xml | /root/Nodemapper5.groovy

Any clue?

Thanks in advance.

--
Regards
Shuja-ur-Rehman Baig
_________________________________
MS CS - School of Science and Engineering
Lahore University of Management Sciences (LUMS)
Sector U, DHA, Lahore, 54792, Pakistan
Cell: +92 3214207445
-
Re: java.lang.OutOfMemoryError: Java heap space
Alex Kozlov, 2010-07-10, 19:30

Hi Shuja,

It looks like the OOM is happening in your code. Are you running MapReduce in a cluster? If so, can you send the exact command line your code is invoked with? You can get it by running 'ps -Af | grep Nodemapper5.groovy' on one of the nodes that is running the task.

Thanks,

Alex K
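A minimal sketch of the process check described above. The bracketed pattern is an addition that does not come from the thread: it stops grep from matching its own command line, so only real mapper processes are listed. The function name filter_mapper is hypothetical.

```shell
# Keep only lines that mention the mapper script.
# The pattern '[N]odemapper5.groovy' matches the text "Nodemapper5.groovy",
# but not grep's own command line, which contains the literal brackets.
filter_mapper() {
  grep '[N]odemapper5.groovy'
}

# Typical use on a task node (left as a comment because it depends on a running job):
#   ps -Af | filter_mapper
```

With the plain pattern, the grep process itself shows up in the listing, which is why the ps output in this thread ends with a "grep Nodemapper5.groovy" row.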
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman, 2010-07-10, 19:59

Hi Alex,

Yeah, I am running the job on a cluster of 2 machines, using the Cloudera distribution of Hadoop. Here is the output of that command:

    root 5277 5238 3 12:51 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D mapred.child.java.opts=-Xmx2000M -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc> -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC11 -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
    root 5360 5074 0 12:51 pts/1 00:00:00 grep Nodemapper5.groovy

Also, what is meant by OOM? Thanks for helping.

Best Regards
-
Re: java.lang.OutOfMemoryError: Java heap space
Alex Kozlov, 2010-07-10, 22:44

Hi Shuja,

First, thank you for using CDH3. Can you also check what mapred.child.ulimit you are using? Try adding "-D mapred.child.ulimit=3145728" to the command line.

I would also recommend upgrading Java to JDK 1.6 update 8 at a minimum, which you can download from the Java SE Homepage <http://java.sun.com/javase/downloads/index.jsp>.

Let me know how it goes.

Alex K
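Hadoop documents mapred.child.ulimit as a virtual-memory limit in kilobytes, so the suggested value works out to 3 GB, which sits above a 2000 MB heap. A quick arithmetic check, with variable names that are purely illustrative:

```shell
# Value suggested in the thread, interpreted as kilobytes.
ulimit_kb=3145728
# Heap requested through -Xmx2000M, in megabytes.
heap_mb=2000

# Convert the limit to megabytes for a like-for-like comparison.
ulimit_mb=$((ulimit_kb / 1024))
echo "ulimit: ${ulimit_mb} MB, heap: ${heap_mb} MB"

# The limit covers the whole process, so it should exceed the heap size.
if [ "$ulimit_mb" -gt "$heap_mb" ]; then
  echo "ulimit leaves room above the heap"
fi
```

The limit applies to the whole process address space, not only the Java heap, which is presumably why a value well above the -Xmx setting was proposed.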
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman, 2010-07-12, 08:32

Hi Alex,

I have updated Java to the latest available version on all machines in the cluster, and I now run the job with this line added:

    -D mapred.child.ulimit=3145728 \

but I still get the same error. Here is the ps output for this job:

    root 7845 5674 3 01:24 pts/1 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx1023m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop-0.20 -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -classpath /usr/lib/hadoop-0.20/conf:/usr/jdk1.6.0_03/lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.3.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hadoop-scribe-log4j-0.20.2+320.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/hsqldb.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.0.1.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.14.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/libfb303.jar:/usr/lib/hadoop-0.20/lib/libthrift.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar org.apache.hadoop.util.RunJar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -D mapred.child.java.opts=-Xmx2000M -D mapred.child.ulimit=3145728 -inputformat StreamInputFormat -inputreader StreamXmlRecordReader,begin=<mdc xmlns:HTML="http://www.w3.org/TR/REC-xml">,end=</mdc> -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 -jobconf mapred.map.tasks=1 -jobconf mapred.reduce.tasks=0 -output RNC14 -mapper /home/ftpuser1/Nodemapper5.groovy -reducer org.apache.hadoop.mapred.lib.IdentityReducer -file /home/ftpuser1/Nodemapper5.groovy
    root 7930 7632 0 01:24 pts/2 00:00:00 grep Nodemapper5.groovy

Any clue?

Thanks
-
Re: java.lang.OutOfMemoryError: Java heap space
Patrick Angeles, 2010-07-12, 13:12

Shuja,

Those settings (mapred.child.java.opts and mapred.child.ulimit) are only used for child JVMs that get forked by the TaskTracker. You are using Hadoop streaming, which means the TaskTracker forks a JVM for streaming, which in turn forks a shell process that runs your Groovy code (in another JVM).

I'm not much of a Groovy expert, but if there's a way you can wrap your code around the MapReduce API, that would work best. Otherwise, you can just pass the heap size in the '-mapper' argument.

Regards,

- Patrick
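The process chain described above can be sketched with plain shell. A JVM flag given to one process is not passed on to the processes it forks, but an exported environment variable is. The sketch below uses a stand-in function, fake_launcher (hypothetical), in place of the Groovy start script, and assumes that the real launcher reads JAVA_OPTS, which the thread does not confirm.

```shell
# Stand-in for the Groovy launcher: a separate shell process that
# reports the JAVA_OPTS value it inherited from its parent.
fake_launcher() {
  sh -c 'echo "launcher sees JAVA_OPTS=${JAVA_OPTS:-unset}"'
}

# Case 1: nothing exported, so the forked process sees no heap setting,
# whatever -Xmx value the parent JVM itself was started with.
unset JAVA_OPTS
without=$(fake_launcher)

# Case 2: a wrapper exports the variable before launching, so it is
# inherited by every process further down the chain.
with=$(export JAVA_OPTS="-Xmx2000m"; fake_launcher)

echo "$without"
echo "$with"
```

If the assumption about JAVA_OPTS holds, a small wrapper script that exports the variable and then runs the Groovy script could be passed as the '-mapper' command in place of the script itself.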
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman, 2010-07-12, 13:29

Hi Patrick,

Thanks for the explanation. I have supplied the heap size in the mapper argument in the following way:

    -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m \

but I still get the same error. Any other ideas?

Thanks
-
Re: java.lang.OutOfMemoryError: Java heap space
Alex Kozlov, 2010-07-12, 15:51

Hi Shuja,

I think you need to enclose the invocation string in quotes. Try:

    -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m"

Also, it would be nice to see how exactly Groovy is invoked. Does Groovy start and then give you the OOM, or does the OOM error occur during startup? Can you see the new process with "ps -aef"?

Can you run Groovy in local mode? Try the "-jt local" option.

Thanks,

Alex K
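A short sketch of why the quotes matter. The shell splits an unquoted string on whitespace before the program sees it, so the streaming jar receives the extra word as a separate argument instead of as part of the mapper command. The helper count_args is hypothetical and only counts what it is given.

```shell
# Print the number of arguments received.
count_args() {
  echo "$#"
}

# Unquoted: the shell splits the mapper command into two words,
# so "-mapper" is followed by two separate arguments.
unquoted=$(count_args -mapper /home/ftpuser1/Nodemapper5.groovy Xmx2000m)

# Quoted: the whole invocation string stays one argument to "-mapper".
quoted=$(count_args -mapper "/home/ftpuser1/Nodemapper5.groovy Xmx2000m")

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
```

Quoting only controls how the string reaches the streaming jar. Whether the trailing word then has any effect on the heap depends on how the mapper script handles its own arguments.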
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman, 2010-07-12, 19:20

Hi Alex,

I have tried using quotes and also the -jt local option, but I get the same heap error. Here is the output of ps -aef:

    UID        PID  PPID  C STIME TTY   TIME     CMD
    root         1     0  0 04:37 ?     00:00:00 init [3]
    root         2     1  0 04:37 ?     00:00:00 [migration/0]
    root         3     1  0 04:37 ?     00:00:00 [ksoftirqd/0]
    root         4     1  0 04:37 ?     00:00:00 [watchdog/0]
    root         5     1  0 04:37 ?     00:00:00 [events/0]
    root         6     1  0 04:37 ?     00:00:00 [khelper]
    root         7     1  0 04:37 ?     00:00:00 [kthread]
    root         9     7  0 04:37 ?     00:00:00 [xenwatch]
    root        10     7  0 04:37 ?     00:00:00 [xenbus]
    root        17     7  0 04:37 ?     00:00:00 [kblockd/0]
    root        18     7  0 04:37 ?     00:00:00 [cqueue/0]
    root        22     7  0 04:37 ?     00:00:00 [khubd]
    root        24     7  0 04:37 ?     00:00:00 [kseriod]
    root        84     7  0 04:37 ?     00:00:00 [khungtaskd]
    root        85     7  0 04:37 ?     00:00:00 [pdflush]
    root        86     7  0 04:37 ?     00:00:00 [pdflush]
    root        87     7  0 04:37 ?     00:00:00 [kswapd0]
    root        88     7  0 04:37 ?     00:00:00 [aio/0]
    root       229     7  0 04:37 ?     00:00:00 [kpsmoused]
    root       248     7  0 04:37 ?     00:00:00 [kstriped]
    root       257     7  0 04:37 ?     00:00:00 [kjournald]
    root       279     7  0 04:37 ?     00:00:00 [kauditd]
    root       307     1  0 04:37 ?     00:00:00 /sbin/udevd -d
    root       634     7  0 04:37 ?     00:00:00 [kmpathd/0]
    root       635     7  0 04:37 ?     00:00:00 [kmpath_handlerd]
    root       660     7  0 04:37 ?     00:00:00 [kjournald]
    root       662     7  0 04:37 ?     00:00:00 [kjournald]
    root      1032     1  0 04:38 ?     00:00:00 auditd
    root      1034  1032  0 04:38 ?     00:00:00 /sbin/audispd
    root      1049     1  0 04:38 ?     00:00:00 syslogd -m 0
    root      1052     1  0 04:38 ?     00:00:00 klogd -x
    root      1090     7  0 04:38 ?     00:00:00 [rpciod/0]
    root      1158     1  0 04:38 ?     00:00:00 rpc.idmapd
    dbus      1171     1  0 04:38 ?     00:00:00 dbus-daemon --system
    root      1184     1  0 04:38 ?     00:00:00 /usr/sbin/hcid
    root      1190     1  0 04:38 ?     00:00:00 /usr/sbin/sdpd
    root      1210     1  0 04:38 ?     00:00:00 [krfcommd]
    root      1244     1  0 04:38 ?     00:00:00 pcscd
    root      1264     1  0 04:38 ?     00:00:00 /usr/bin/hidd --server
    root      1295     1  0 04:38 ?     00:00:00 automount
    root      1314     1  0 04:38 ?     00:00:00 /usr/sbin/sshd
    root      1326     1  0 04:38 ?     00:00:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
    root      1337     1  0 04:38 ?     00:00:00 /usr/sbin/vsftpd /etc/vsftpd/vsftpd.conf
    root      1354     1  0 04:38 ?     00:00:00 sendmail: accepting connections
    smmsp     1362     1  0 04:38 ?     00:00:00 sendmail: Queue runner@01:00:00 for /var/spool/clientmqueue
    root      1379     1  0 04:38 ?     00:00:00 gpm -m /dev/input/mice -t exps2
    root      1410     1  0 04:38 ?     00:00:00 crond
    xfs       1450     1  0 04:38 ?     00:00:00 xfs -droppriv -daemon
    root      1482     1  0 04:38 ?     00:00:00 /usr/sbin/atd
    68        1508     1  0 04:38 ?     00:00:00 hald
    root      1509  1508  0 04:38 ?     00:00:00 hald-runner
    root      1533     1  0 04:38 ?     00:00:00 /usr/sbin/smartd -q never
    root      1536     1  0 04:38 xvc0  00:00:00 /sbin/agetty xvc0 9600 vt100-nav
    root      1537     1  0 04:38 ?     00:00:00 /usr/bin/python -tt /usr/sbin/yum-updatesd
    root      1539     1  0 04:38 ?     00:00:00 /usr/libexec/gam_server
    root     21022  1314  0 11:27 ?     00:00:00 sshd: root@pts/0
    root     21024 21022  0 11:27 pts/0 00:00:00 -bash
    root     21103  1314  0 11:28 ?     00:00:00 sshd: root@pts/1
    root     21105 21103  0 11:28 pts/1 00:00:00 -bash
    root     21992  1314  0 11:47 ?     00:00:00 sshd: root@pts/2
    root     21994 21992  0 11:47 pts/2 00:00:00 -bash
    root     22433  1314  0 11:49 ?     00:00:00 sshd: root@pts/3
    root     22437 22433  0 11:49 pts/3 00:00:00 -bash
    hadoop   24808     1  0 12:01 ?     00:00:02 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
    hadoop   24893     1  0 12:01 ?     00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
    hadoop   24988     1  0 12:01 ?     00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
    hadoop   25085     1  0 12:01 ?     00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote -Dhadoop.lo
    hadoop   25175     1  0 12:01 ?     00:00:01 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/bin/../logs -Dhadoop.log.file=hadoo
    root     25925 21994  1 12:06 pts/2 00:00:00 /usr/jdk1.6.0_03/bin/java -Xmx2001m -Dhadoop.log.dir=/usr/lib/hadoop-0.20/logs -Dhadoop.log.file=hadoop.log -
    hadoop   26120 25175 14 12:06 ?     00:00:01 /usr/jdk1.6.0_03/jre/bin/java -Djava.library.path=/usr/jdk1.6.0_03/jre/lib/i386/client:/usr/jdk1.6.0_03/jre/l
    hadoop   26162 26120 89 12:06 ?     00:00:05 /usr/jdk1.6.0_03/bin/java -classpath /usr/local/groovy/lib/groovy-1.7.3.jar -Dscript.name=/usr/local/groovy/b
    root     26185 22437  0 12:07 pts/3 00:00:00 ps -aef

The command I am executing is:

    hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
    -D mapred.child.java.opts=-Xmx1024m \
    -inputformat StreamInputFormat \
    -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml">,end=</mdc>" \
    -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
    -jobconf mapred.map.tasks=1 \
    -jobconf mapred.reduce.tasks=0 \
    -output RNC25 \
    -mapper "/home/ftpuser1/Nod
-
Re: java.lang.OutOfMemoryError: Java heap space
Alex Kozlov 2010-07-12, 20:01
Hi Shuja,

Java honors the last -Xmx, so if you have multiple "-Xmx ..." options on the command line, the last one wins. Unfortunately, your command lines are truncated. Can you show us the full command line, particularly for process 26162? This seems to be causing problems.

If you are running your cluster on 2 nodes, it may be that the task was scheduled on the second node. Did you run "ps -aef" on the second node as well? You can see the task assignment in the JT web UI (http://jt-name:50030, drill down to tasks).

I suggest you debug your program in local mode first, however (use the "*-jt local*" option). Did you try the "*-D mapred.child.ulimit=3145728*" option? I do not see it on the command line.

Alex K
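
A minimal sketch of how the untruncated command line for a process such as 26162 could be captured, and how the effective -Xmx could be picked out of it. It assumes a Linux node where /proc is available and a grep that supports -o; the function names last_xmx and full_cmdline are made up for illustration and are not part of Hadoop or of the original poster's setup.

```shell
#!/bin/sh
# Print the last -Xmx option found in a JVM command line read from stdin.
# The JVM uses the last -Xmx it is given, so this is the one that counts.
last_xmx() {
    grep -o -- '-Xmx[0-9][0-9]*[kKmMgG]\{0,1\}' | tail -n 1
}

# Print the full command line of a process, read from /proc (Linux only).
# Arguments are NUL-separated in /proc/<pid>/cmdline, so turn them into spaces.
# $1 = process ID
full_cmdline() {
    tr '\0' ' ' < "/proc/$1/cmdline"
    echo
}

# Usage on a node (26162 is the PID from the ps listing in this thread):
#   full_cmdline 26162 | last_xmx

# Self-contained demo on a made-up command line with two -Xmx options.
echo "java -Xmx2001m -Dfoo=bar -Xmx1024m Child" | last_xmx
```

The demo line prints -Xmx1024m. Reading /proc sidesteps the terminal width limit entirely; redirecting the ps output to a file and opening it in an editor is another option.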
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman 2010-07-12, 20:24
Hi Alex,

I am using PuTTY to connect to the servers, and what I sent is almost my maximum screen output. PuTTY does not allow me to increase the size of the terminal. Is there any other way to get the complete output of ps -aef?

I have now run the following command and, thank God, it did not fail and it produced the desired output:

hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar \
  -D mapred.child.java.opts=-Xmx1024m \
  -D mapred.child.ulimit=3145728 \
  -jt local \
  -inputformat StreamInputFormat \
  -inputreader "StreamXmlRecordReader,begin=<mdc xmlns:HTML=\"http://www.w3.org/TR/REC-xml\">,end=</mdc>" \
  -input /user/root/RNCDATA/MDFDORKUCRAR02/A20100531.0000-0700-0015-0700_RNCCN-MDFDORKUCRAR02 \
  -jobconf mapred.map.tasks=1 \
  -jobconf mapred.reduce.tasks=0 \
  -output RNC32 \
  -mapper /home/ftpuser1/Nodemapper5.groovy \
  -reducer org.apache.hadoop.mapred.lib.IdentityReducer \
  -file /home/ftpuser1/Nodemapper5.groovy

But when I omit -jt local, it produces the same error.

Thanks, Alex, for helping.

Regards
Shuja
-
Re: java.lang.OutOfMemoryError: Java heap space
Alex Kozlov 2010-07-12, 20:34
Hmm. It means your options are not propagated to the nodes. Can you put *mapred.child.ulimit* in mapred-site.xml and restart the tasktrackers? I was under the impression that the -D options on your command line should be enough, though. Glad you got it working in local mode.

-- Alex K
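
A small sanity check on the two numbers used in this thread, as a sketch only. It assumes, as the Hadoop 0.20 default configuration describes it, that mapred.child.ulimit is a virtual memory limit in kilobytes and has to be at least as large as the -Xmx given to the child JVM. The function name ulimit_covers_heap is made up for illustration.

```shell
#!/bin/sh
# Succeed when a virtual-memory limit given in KB is at least as large
# as a heap size given in MB.
# $1 = mapred.child.ulimit value (KB), $2 = -Xmx value (MB)
ulimit_covers_heap() {
    [ "$1" -ge $(( $2 * 1024 )) ]
}

# Values from this thread: 3145728 KB against a 1024 MB heap.
echo "ulimit in MB: $(( 3145728 / 1024 ))"
if ulimit_covers_heap 3145728 1024; then
    echo "ulimit covers the heap"
else
    echo "ulimit is smaller than the heap"
fi
```

With these values the limit works out to 3072 MB, three times the 1024 MB heap, so the limit itself is unlikely to be what caps the child JVM once it actually reaches the node.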
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman 2010-07-12, 21:08
Hi
I have added the following lines to the mapred-site.xml file on my master node:

<property>
  <name>mapred.child.ulimit</name>
  <value>3145728</value>
</property>

I ran the job again and, wow, the job completed on the 4th attempt. I checked the web UI at 50030. Hadoop ran the task 3 times on the master server and it failed each time, but when it ran on the 2nd node it succeeded and produced the desired result. Why did it fail on the master?

Thanks
Shuja
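
Since the property was added on the master only, it may help to confirm what each node's own mapred-site.xml actually contains. The following is a rough sketch, not a real XML parser: it assumes the usual layout where <name> is followed directly by <value>, and the helper name get_prop is made up for illustration. The location of mapred-site.xml is installation-specific and is not given in this thread.

```shell
#!/bin/sh
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = path to the config file, $2 = property name
get_prop() {
    # Join the file into one line so <name> and <value> can be matched together.
    tr -d '\n' < "$1" |
        sed -n "s:.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*:\1:p"
}

# Demo on a throwaway file holding the fragment from the message above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>mapred.child.ulimit</name>
    <value>3145728</value>
  </property>
</configuration>
EOF
get_prop "$sample" mapred.child.ulimit
rm -f "$sample"
```

Running the same function against the real file on the master and on the second node would show whether both tasktrackers see the same value.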
-
Re: java.lang.OutOfMemoryError: Java heap space
Alex Kozlov 2010-07-12, 21:57
Maybe you do not have enough available memory on the master? What is the output of "*free*" on both nodes?

-- Alex K
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman 2010-07-12, 22:06
*Master Node output:*

             total       used       free     shared    buffers     cached
Mem:       2097328     515576    1581752          0      56060     254760
-/+ buffers/cache:     204756    1892572
Swap:       522104          0     522104

*Slave Node output:*

             total       used       free     shared    buffers     cached
Mem:       1048752     860684     188068          0     148388     570948
-/+ buffers/cache:     141348     907404
Swap:       522104         40     522064

It seems that there is more free memory on the server (the master node).
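
To compare the two nodes directly, the "-/+ buffers/cache" row can be converted to megabytes. This is a sketch that assumes the older free layout shown in this message, where that row exists and values are in KB; newer versions of free print an "available" column instead. The function name avail_mb is made up for illustration.

```shell
#!/bin/sh
# Read old-style `free` output on stdin and print the memory that is free
# once buffers and cache are discounted, in whole MB.
avail_mb() {
    # Fields on that row: "-/+", "buffers/cache:", used, free
    awk '/buffers\/cache:/ { printf "%d\n", $4 / 1024 }'
}

# The two "-/+ buffers/cache" rows quoted in this message.
echo "-/+ buffers/cache: 204756 1892572" | avail_mb   # master
echo "-/+ buffers/cache: 141348 907404"  | avail_mb   # slave
```

This prints 1848 for the master and 886 for the slave, so on these figures the master has roughly twice the headroom of the node where the task succeeded.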
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman 2010-07-12, 23:53
Alex, any guess why it fails on the server when it has more free memory than the slave?
-
Re: java.lang.OutOfMemoryError: Java heap space
Ted Yu 2010-07-13, 00:49
Normally the task tracker isn't run on the name node.
Did you configure it otherwise?
-
Re: java.lang.OutOfMemoryError: Java heap space
Alex Kozlov 2010-07-13, 02:07
Honestly, no idea. I can only suggest running "hadoop jar /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2+320.jar -jt local -fs local ..." on both nodes and debugging.

On Mon, Jul 12, 2010 at 4:53 PM, Shuja Rehman <[EMAIL PROTECTED]> wrote:

> Alex, any guess why it fails on the server while it has more free memory
> than the slave?
>
> On Tue, Jul 13, 2010 at 3:06 AM, Shuja Rehman <[EMAIL PROTECTED]> wrote:
>
> > *Master Node output:*
> >
> >              total       used       free     shared    buffers     cached
> > Mem:       2097328     515576    1581752          0      56060     254760
> > -/+ buffers/cache:     204756    1892572
> > Swap:       522104          0     522104
> >
> > *Slave Node output:*
> >
> >              total       used       free     shared    buffers     cached
> > Mem:       1048752     860684     188068          0     148388     570948
> > -/+ buffers/cache:     141348     907404
> > Swap:       522104         40     522064
> >
> > It seems that there is more free memory on the server.
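For anyone comparing the two nodes, the "-/+ buffers/cache" figure can be cross-checked from the "Mem:" row, since buffers and page cache can be reclaimed for applications. The snippet below is only a rough sketch in Python; the numbers are copied from the `free` output quoted in this thread, and the helper name `available_kb` is made up for illustration.

```python
# Values in KB, copied from the "Mem:" rows of the `free` output in this thread.
NODES = {
    "master": {"total": 2097328, "used": 515576, "free": 1581752,
               "buffers": 56060, "cached": 254760},
    "slave": {"total": 1048752, "used": 860684, "free": 188068,
              "buffers": 148388, "cached": 570948},
}


def available_kb(mem):
    """Hypothetical helper: memory usable by applications, in KB.

    Buffers and page cache are reclaimable, so they count as available.
    This should match the "free" column of the "-/+ buffers/cache" row.
    """
    return mem["free"] + mem["buffers"] + mem["cached"]


for name, mem in NODES.items():
    avail = available_kb(mem)
    print(f"{name}: {avail} KB available (about {avail / 1024:.0f} MB)")
```

Both results agree with the "-/+ buffers/cache" rows above, so the master really does have roughly twice the available memory of the slave. That makes a plain shortage of RAM on the master look less likely, though it does not explain the failure by itself.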
-
Re: java.lang.OutOfMemoryError: Java heap space
Shuja Rehman 2010-07-13, 11:37
Hi Ted Yu,
Since I have a cluster of only 2 nodes, I have configured a task tracker on the name node as well to process the files.

On Tue, Jul 13, 2010 at 5:49 AM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Normally task tracker isn't run on Name node.
> Did you configure otherwise?

Regards
Shuja-ur-Rehman Baig
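A quick sanity check for the two settings discussed in this thread: the Hadoop 0.20 configuration documentation describes mapred.child.ulimit as a virtual memory limit in KB that must be at least as large as the -Xmx passed to the child JVM. The sketch below (Python, not part of Hadoop) compares the two, taking the last -Xmx on the line as the effective one, as Alex noted earlier. The function names `xmx_to_kb` and `ulimit_covers_heap` are hypothetical.

```python
import re

# Multipliers to bytes for the optional -Xmx unit suffix (no suffix means bytes).
_UNIT_BYTES = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def xmx_to_kb(java_opts):
    """Hypothetical helper: heap size in KB of the LAST -Xmx flag, or None."""
    matches = re.findall(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if not matches:
        return None
    value, unit = matches[-1]  # the last -Xmx on the command line wins
    return int(value) * _UNIT_BYTES[unit.lower()] // 1024


def ulimit_covers_heap(ulimit_kb, java_opts):
    """True if the ulimit (in KB) is at least the configured max heap."""
    heap_kb = xmx_to_kb(java_opts)
    return heap_kb is None or ulimit_kb >= heap_kb


# Values taken from the command and mapred-site.xml snippet in this thread.
print(xmx_to_kb("-Xmx1024m"))                    # heap size in KB
print(ulimit_covers_heap(3145728, "-Xmx1024m"))  # 3 GB limit vs 1 GB heap
```

With the values from this thread the limit (3145728 KB, i.e. 3 GB) comfortably covers both the 1024m and the 2000M heap settings, so the check only confirms that the two settings are consistent with each other. It says nothing about whether the options actually reach the task JVMs on each node.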