MapReduce >> mail # user >> Re: OutOfMemory during Plain Java MapReduce


Harsh J 2013-03-08, 10:57
Paul Wilkinson 2013-03-08, 12:09
Re: OutOfMemory during Plain Java MapReduce
"A potential problem could be, that a reduce is going to write files >600MB and our mapred.child.java.opts is set to ~380MB."

Isn't the minimum heap normally 512MB?

Why not just increase your child heap size, assuming you have enough memory on the box...
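[Editor's note: Paul's suggestion maps to the `mapred.child.java.opts` property in Hadoop 1.x. A minimal sketch for a cluster-wide mapred-site.xml follows; the 1024m value is illustrative only and should be sized to the actual memory on the nodes.]

```xml
<!-- mapred-site.xml: raise the heap for map/reduce child JVMs.
     1024m is an illustrative value, not a recommendation. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
```

The same option can also be set per job on the submitting configuration instead of cluster-wide.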
On Mar 8, 2013, at 4:57 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hi,
>
> When you implement code that starts memory-storing value copies for
> every record (even if of just a single key), things are going to break
> in big-data-land. Practically, post-partitioning, the number of values for
> a given key can be huge given the source data, so you cannot hold them
> all in memory and then write in one go. You'd probably need to write out
> something continuously if you really want to do this, or use an
> alternative form of key-value storage where updates can be made
> incrementally (Apache HBase is such a store, as one example).
>
> This has been discussed before IIRC, and if the goal were to store the
> outputs onto a file, it's better to serialize them directly
> to an open file instead of keeping them in a data structure and
> serializing at the end. The caveats that'd apply if you were to
> open your own file from a task are described at
> http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2BAC8-write-to_hdfs_files_directly_from_map.2BAC8-reduce_tasks.3F.
>
> On Fri, Mar 8, 2013 at 4:35 AM, Christian Schneider
> <[EMAIL PROTECTED]> wrote:
>> I had a look at the stack trace and it says the problem is in the reducer:
>> userSet.add(iterator.next().toString());
>>
>> Error: Java heap space
>> attempt_201303072200_0016_r_000002_0: WARN : mapreduce.Counters - Group
>> org.apache.hadoop.mapred.Task$Counter is deprecated. Use
>> org.apache.hadoop.mapreduce.TaskCounter instead
>> attempt_201303072200_0016_r_000002_0: WARN :
>> org.apache.hadoop.conf.Configuration - session.id is deprecated. Instead,
>> use dfs.metrics.session-id
>> attempt_201303072200_0016_r_000002_0: WARN :
>> org.apache.hadoop.conf.Configuration - slave.host.name is deprecated.
>> Instead, use dfs.datanode.hostname
>> attempt_201303072200_0016_r_000002_0: FATAL: org.apache.hadoop.mapred.Child
>> - Error running child : java.lang.OutOfMemoryError: Java heap space
>> attempt_201303072200_0016_r_000002_0: at
>> java.util.Arrays.copyOfRange(Arrays.java:3209)
>> attempt_201303072200_0016_r_000002_0: at
>> java.lang.String.<init>(String.java:215)
>> attempt_201303072200_0016_r_000002_0: at
>> java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
>> attempt_201303072200_0016_r_000002_0: at
>> java.nio.CharBuffer.toString(CharBuffer.java:1157)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.io.Text.decode(Text.java:394)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.io.Text.decode(Text.java:371)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.io.Text.toString(Text.java:273)
>> attempt_201303072200_0016_r_000002_0: at
>> com.myCompany.UserToAppReducer.reduce(RankingReducer.java:21)
>> attempt_201303072200_0016_r_000002_0: at
>> com.myCompany.UserToAppReducer.reduce(RankingReducer.java:1)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:164)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:610)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:444)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>> attempt_201303072200_0016_r_000002_0: at
>> java.security.AccessController.doPrivileged(Native Method)
>> attempt_201303072200_0016_r_000002_0: at
>> javax.security.auth.Subject.doAs(Subject.java:396)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>> attempt_201303072200_0016_r_000002_0: at
>> org.apache.hadoop.mapred.Child.main(Child.java:262)
>>
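[Editor's note: the failing line above (`userSet.add(iterator.next().toString())`) accumulates every value for a key into one in-memory set, which is exactly the pattern Harsh warns against. His advice to write continuously can be sketched with plain Java; the names below are hypothetical stand-ins, not the poster's actual `UserToAppReducer` code. Each value is written as soon as it is read, so per-key memory stays constant regardless of how many values the key has.]

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;

public class StreamingReduceSketch {
    // Streams each value for a key straight to the output writer instead of
    // collecting them in a Set first: memory use is one value at a time,
    // not O(total values per key).
    static void reduce(String key, Iterator<String> values, Writer out)
            throws IOException {
        while (values.hasNext()) {
            out.write(key + "\t" + values.next() + "\n");
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        reduce("user42", Arrays.asList("appA", "appB", "appC").iterator(), out);
        // Prints three tab-separated lines, one per value.
        System.out.print(out);
    }
}
```

In an actual Hadoop reducer the `Writer` would be `context.write(...)` (or a MultipleOutputs sink) rather than a `StringWriter`; the point is only that nothing is buffered per key.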