Re: Child JVM memory allocation / Usage
Couple of things to check:

Does your class com.hadoop.publicationMrPOC.Launcher implement the Tool
interface? You can look at an example at
http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Source+Code-N110D0.
That's what accepts the -D params on the command line (a minimal sketch of
such a launcher follows the snippet below). Alternatively, you can also set
the same properties in the configuration object like this, in your launcher
code:

Configuration conf = new Configuration();

// Symlink the dump script from the distributed cache into the task's working
// directory, and run it when the child JVM goes out of memory.
conf.set("mapred.create.symlink", "yes");
conf.set("mapred.cache.files",
    "hdfs:///user/hemanty/scripts/copy_dump.sh#copy_dump.sh");
conf.set("mapred.child.java.opts",
    "-Xmx200m -XX:+HeapDumpOnOutOfMemoryError"
        + " -XX:HeapDumpPath=./heapdump.hprof"
        + " -XX:OnOutOfMemoryError=./copy_dump.sh");
Second, the position of the arguments matters: the -D options have to come
after the main class but before the program arguments, so that
GenericOptionsParser can pick them up. I think the command should be

hadoop jar LL.jar com.hadoop.publicationMrPOC.Launcher \
  -Dmapred.create.symlink=yes \
  -Dmapred.cache.files=hdfs:///user/ims-b/dump.sh#dump.sh \
  -Dmapred.reduce.child.java.opts='-Xmx2048m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./myheapdump.hprof -XX:OnOutOfMemoryError=./dump.sh' \
  Fudan\ Univ

Thanks
Hemanth
On Wed, Mar 27, 2013 at 1:58 PM, nagarjuna kanamarlapudi <
[EMAIL PROTECTED]> wrote:

> Hi Hemanth/Koji,
>
> Seems the above script doesn't work for me. Can you look into the following
> and suggest what more I can do?
>
>
>  hadoop fs -cat /user/ims-b/dump.sh
> #!/bin/sh
> hadoop dfs -put myheapdump.hprof /tmp/myheapdump_ims/${PWD//\//_}.hprof
>
>
> hadoop jar LL.jar com.hadoop.publicationMrPOC.Launcher  Fudan\ Univ
>  -Dmapred.create.symlink=yes
> -Dmapred.cache.files=hdfs:///user/ims-b/dump.sh#dump.sh
> -Dmapred.reduce.child.java.opts='-Xmx2048m -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=./myheapdump.hprof -XX:OnOutOfMemoryError=./dump.sh'
>
>
> I am not able to see the heap dump at  /tmp/myheapdump_ims
>
>
>
> Error in the mapper:
>
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> ... 17 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:2734)
> at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
> at java.util.ArrayList.add(ArrayList.java:351)
> at com.hadoop.publicationMrPOC.PublicationMapper.configure(PublicationMapper.java:59)
> ... 22 more
>
>
>
>
>
> On Wed, Mar 27, 2013 at 10:16 AM, Hemanth Yamijala <
> [EMAIL PROTECTED]> wrote:
>
>> Koji,
>>
>> Works beautifully. Thanks a lot. I learnt at least 3 different things
>> with your script today!
>>
>> Hemanth
>>
>>
>> On Tue, Mar 26, 2013 at 9:41 PM, Koji Noguchi <[EMAIL PROTECTED]> wrote:
>>
>>> Create a dump.sh on hdfs.
>>>
>>> $ hadoop dfs -cat /user/knoguchi/dump.sh
>>> #!/bin/sh
>>> hadoop dfs -put myheapdump.hprof /tmp/myheapdump_knoguchi/${PWD//\//_}.hprof
>>>
>>> Run your job with
>>>
>>> -Dmapred.create.symlink=yes
>>> -Dmapred.cache.files=hdfs:///user/knoguchi/dump.sh#dump.sh
>>> -Dmapred.reduce.child.java.opts='-Xmx2048m
>>> -XX:+HeapDumpOnOutOfMemoryError
>>> -XX:HeapDumpPath=./myheapdump.hprof -XX:OnOutOfMemoryError=./dump.sh'
>>>
>>> This should create the heap dump on hdfs at /tmp/myheapdump_knoguchi.
>>>
>>> Koji
>>>
>>>
>>> On Mar 26, 2013, at 11:53 AM, Hemanth Yamijala wrote:
>>>
>>> > Hi,
>>> >
>>> > I tried to use the -XX:+HeapDumpOnOutOfMemoryError. Unfortunately,
>>> > like I suspected, the dump goes to the current work directory of the task
>>> > attempt as it executes on the cluster. This directory is cleaned up once
>>> > the task is done. There are options to keep failed task files or task files
>>> > matching a pattern. However, these do NOT retain the current working
>>> > directory. Hence, there is no option to get this from a cluster AFAIK.