Hadoop, mail # user - Shuffle In Memory OutOfMemoryError

RE: Shuffle In Memory OutOfMemoryError
Andy Sautins 2010-03-07, 23:57

  Thanks Ted.  Very helpful.  You are correct that I misunderstood the code at ReduceTask.java:1535.  I missed the fact that it's in an IOException catch block.  My mistake.  That's what I get for being in a rush.

  For what it's worth, I did re-run the job with mapred.reduce.parallel.copies set to values from 5 all the way down to 1.  All runs failed with the same error:

Error: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

  So from that it does seem like something else might be going on, yes?  I need to do some more research.  

  I appreciate your insights.


-----Original Message-----
From: Ted Yu [mailto:[EMAIL PROTECTED]]
Sent: Sunday, March 07, 2010 3:38 PM
Subject: Re: Shuffle In Memory OutOfMemoryError

My observation is based on this call chain:
MapOutputCopier.run() calling copyOutput() calling getMapOutput() calling

Basically ramManager.canFitInMemory() makes its decision without considering
the number of MapOutputCopiers that are running. Thus 1.25 * 0.7 of the total
heap may be used in shuffling if the default parameters are used.
Of course, you should check the value for mapred.reduce.parallel.copies to
see if it is 5. If it is 4 or lower, my reasoning wouldn't apply.
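
[Editor's sketch] The arithmetic behind the "1.25 * 0.7" estimate can be written out as below. This only reproduces the estimate; it is not code from ReduceTask.java. It assumes the 0.20.x defaults of 0.70 for mapred.job.shuffle.input.buffer.percent and 0.25 for the single-segment fraction, and it assumes every copier holds a maximum-size in-memory segment at once. The class and method names are hypothetical.

```java
public class ShuffleHeapEstimate {
    // Assumed default of mapred.job.shuffle.input.buffer.percent (check your config).
    static final double SHUFFLE_INPUT_BUFFER_PERCENT = 0.70;
    // Assumed fraction of the shuffle buffer that one map output may occupy.
    static final double MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION = 0.25;

    // Largest single map output (in bytes) that would pass canFitInMemory().
    static long maxSingleShuffleLimit(long heapBytes) {
        long maxInMemCopyUse = (long) (heapBytes * SHUFFLE_INPUT_BUFFER_PERCENT);
        return (long) (maxInMemCopyUse * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
    }

    // Fraction of the heap in use if every copier holds a maximum-size segment.
    static double worstCaseHeapFraction(int parallelCopies) {
        return parallelCopies * MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION
                * SHUFFLE_INPUT_BUFFER_PERCENT;
    }

    public static void main(String[] args) {
        long heapBytes = 1L << 30; // hypothetical 1 GiB reducer heap
        System.out.println("single-segment limit (bytes): "
                + maxSingleShuffleLimit(heapBytes));
        for (int copies = 1; copies <= 5; copies++) {
            System.out.println(copies + " copier(s): heap fraction "
                    + worstCaseHeapFraction(copies));
        }
    }
}
```

With 4 copiers the estimate stays at the 0.7 shuffle-buffer budget; with 5 it rises to roughly 0.875 of the heap, which is why the reasoning applies only when mapred.reduce.parallel.copies is 5 or more.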

About ramManager.unreserve() call, ReduceTask.java from hadoop 0.20.2 only
has 2731 lines. So I have to guess the location of the code snippet you
I found this around line 1535:
        } catch (IOException ioe) {
          LOG.info("Failed to shuffle from " +

          // Inform the ram-manager

          // Discard the map-output
          try {
          } catch (IOException ignored) {
            LOG.info("Failed to discard map-output from " +
                     mapOutputLoc.getTaskAttemptId(), ignored);
Please confirm the line number.

If we're looking at the same code, I am afraid I don't see how we can
improve it. First, I assume IOException shouldn't happen that often. Second,
mapOutput.discard() just sets:
          data = null;
for the in-memory case. Even if we call mapOutput.discard() before
ramManager.unreserve(), we don't know when GC would kick in and make more
memory available.
Of course, given the large number of map outputs in your system, it becomes
more likely that the root cause I described makes the OOME happen sooner.


On Sun, Mar 7, 2010 at 1:03 PM, Andy Sautins <[EMAIL PROTECTED]> wrote:

>   Ted,
>   I'm trying to follow the logic in your mail and I'm not sure I'm
> following.  If you wouldn't mind helping me understand I would appreciate it.
>   Looking at the code maxSingleShuffleLimit is only used in determining if
> the copy _can_ fit into memory:
>     boolean canFitInMemory(long requestedSize) {
>        return (requestedSize < Integer.MAX_VALUE &&
>                requestedSize < maxSingleShuffleLimit);
>      }
>    It also looks like RamManager.reserve should wait until memory is
> available, so it shouldn't hit a memory limit for that reason.
>    What does seem a little strange to me is the following ( ReduceTask.java
> starting at 2730 ):
>          // Inform the ram-manager
>          ramManager.closeInMemoryFile(mapOutputLength);
>          ramManager.unreserve(mapOutputLength);
>          // Discard the map-output
>          try {
>            mapOutput.discard();
>          } catch (IOException ignored) {