

ReduceTask > ShuffleRamManager : Java Heap memory error

Hi all,
First, many thanks for the quality of the work you are doing.

I am running into a problem with memory management at shuffle time. I regularly get:

Map output copy failure : java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1612)
Reading the code in ReduceTask.java (package org.apache.hadoop.mapred), it looks as if ShuffleRamManager limits the maximum RAM allocation to Integer.MAX_VALUE * maxInMemCopyUse:

maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
           (int)Math.min(Runtime.getRuntime().maxMemory(), Integer.MAX_VALUE))
         * maxInMemCopyUse);

Why is that so?
And why is the value cast down to an int when its underlying type is long?

Does this mean that a reduce task cannot take advantage of more than 2 GB of memory?
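
To make the question concrete, here is a minimal standalone sketch of the arithmetic in the quoted expression. It does not use Hadoop at all: the class and method names are hypothetical, the "mapred.job.reduce.total.mem.bytes" override is left out, and the heap sizes are made-up inputs. It only shows that, once the heap size is clamped to Integer.MAX_VALUE, the computed limit stops growing for any heap of 2 GB or more.

```java
public class ShuffleCapSketch {

    // Hypothetical helper mirroring the shape of the quoted expression:
    // the heap size is clamped to Integer.MAX_VALUE before the fraction
    // (maxInMemCopyUse) is applied, and the result is cast back to int.
    static int cappedLimit(long maxHeapBytes, float maxInMemCopyUse) {
        int clamped = (int) Math.min(maxHeapBytes, Integer.MAX_VALUE);
        return (int) (clamped * maxInMemCopyUse);
    }

    public static void main(String[] args) {
        // Illustrative heap sizes only: 1 GiB, 2 GiB, 8 GiB, 64 GiB.
        long[] heaps = {1L << 30, 2L << 30, 8L << 30, 64L << 30};
        for (long heap : heaps) {
            System.out.printf("heap=%d bytes -> limit=%d bytes%n",
                    heap, cappedLimit(heap, 0.70f));
        }
    }
}
```

With a fraction of 0.70, the three larger heaps all produce the same limit, which is what the question above is getting at.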

To explain my use case a little:
I am running about 2700 map tasks (each working on a 128 MB block of data), and when the reduce phase starts, it sometimes fails with Java heap space errors.

Configuration:
java 1.6.0-27
hadoop 0.20.2
-Xmx1400m
io.sort.mb 400
io.sort.factor 25
io.sort.spill.percent 0.80
mapred.job.shuffle.input.buffer.percent 0.70
ShuffleRamManager: MemoryLimit=913466944, MaxSingleShuffleLimit=228366736

I will decrease mapred.job.shuffle.input.buffer.percent to limit the errors, but I am not fully confident about how well the process will scale.
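
As a rough check of what lowering that setting would do, the sketch below redoes the limit arithmetic outside Hadoop. Several things in it are assumptions rather than facts from the job: the heap value 1304952832 is back-computed so that 0.70 reproduces the logged MemoryLimit (the real Runtime.maxMemory() value is not known here), the 0.25 single-segment fraction is inferred from the ratio of the two logged values, and the class and method names are hypothetical.

```java
public class ShuffleLimitSketch {

    // Assumption: inferred from MaxSingleShuffleLimit / MemoryLimit in the log line.
    static final float SINGLE_SEGMENT_FRACTION = 0.25f;

    // Assumption: a heap size chosen so that 0.70 reproduces the logged MemoryLimit.
    static final long ASSUMED_MAX_HEAP_BYTES = 1304952832L;

    // Same shape as the quoted maxSize expression, without the config override.
    static int memoryLimit(long maxHeapBytes, float bufferPercent) {
        int clamped = (int) Math.min(maxHeapBytes, Integer.MAX_VALUE);
        return (int) (clamped * bufferPercent);
    }

    // Largest single map output that would still be shuffled in memory.
    static int maxSingleShuffleLimit(int memoryLimit) {
        return (int) (memoryLimit * SINGLE_SEGMENT_FRACTION);
    }

    public static void main(String[] args) {
        float[] percents = {0.70f, 0.50f};
        for (float p : percents) {
            int limit = memoryLimit(ASSUMED_MAX_HEAP_BYTES, p);
            System.out.printf("percent=%.2f MemoryLimit=%d MaxSingleShuffleLimit=%d%n",
                    p, limit, maxSingleShuffleLimit(limit));
        }
    }
}
```

Under these assumptions, 0.70 gives back the logged MemoryLimit=913466944 and MaxSingleShuffleLimit=228366736, and 0.50 would bring the in-memory shuffle budget down to about 652 MB. This says nothing about whether that is low enough to avoid the OutOfMemoryError; it only shows how the two limits move with the setting.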

Any help would be welcome.

Once again, many thanks,
Olivier
P.S.: Sorry if I misunderstood the code; any explanation would be very welcome.
