MapReduce, mail # user - How to troubleshoot OutOfMemoryError


How to troubleshoot OutOfMemoryError
David Parks 2012-12-22, 04:33
I'm pretty consistently seeing a few reduce tasks fail with an OutOfMemoryError (stack trace below). It doesn't kill the job, but it slows it down.

In my current case the reducer is quite simple; the algorithm basically does:

1. Do you have 2 values for this key?
2. If so, build a JSON string and emit a NullWritable key and a Text value.

The string buffer I use to build the JSON is re-used, and I can't see anywhere in my code that would hold more than ~50 KB of memory at any point in time.
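For reference, the reduce logic described above might be sketched in plain Java roughly like this (class, method, and field names are invented for illustration; the real job would extend Hadoop's Reducer and emit a NullWritable/Text pair instead of returning a String):

```java
import java.util.Arrays;
import java.util.List;

public class PairJoinSketch {
    // Re-used buffer, as in the poster's description: reset, never reallocated.
    private final StringBuilder json = new StringBuilder(1024);

    /** Returns a JSON string if the key has exactly 2 values, else null (emit nothing). */
    public String reduce(String key, List<String> values) {
        if (values.size() != 2) {
            return null;
        }
        json.setLength(0); // reset the shared buffer instead of creating a new one
        json.append("{\"key\":\"").append(key)
            .append("\",\"a\":\"").append(values.get(0))
            .append("\",\"b\":\"").append(values.get(1))
            .append("\"}");
        return json.toString();
    }

    public static void main(String[] args) {
        PairJoinSketch r = new PairJoinSketch();
        System.out.println(r.reduce("k1", Arrays.asList("x", "y")));
        System.out.println(r.reduce("k2", Arrays.asList("x")));
    }
}
```

A reducer like this holds only the one StringBuilder, which is consistent with the ~50 KB estimate; note the stack trace below points at the shuffle copier, not user reduce code.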

 

But I want to verify: is there a way to get a heap dump (and the rest of the JVM state) after this error? I'm running Hadoop 1.0.3 on AWS MapReduce.
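Not Hadoop-specific, but the standard JVM route would be to have each child task JVM dump its heap on OOM, e.g. via mapred.child.java.opts in the job configuration (the heap size and dump path below are placeholders, not values from this thread):

```xml
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp</value>
</property>
```

The resulting .hprof file on the task node can then be opened with a heap analyzer such as jhat or Eclipse MAT.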

 

Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1711)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1571)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1412)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1344)

Manoj Babu 2012-12-22, 15:53