RE: Shuffle In Memory OutOfMemoryError

   Ted,

   I'm trying to follow the logic in your mail, but I'm not sure I'm following it.  If you wouldn't mind helping me understand, I would appreciate it.

   Looking at the code, maxSingleShuffleLimit is only used in determining whether the copy _can_ fit into memory:

     boolean canFitInMemory(long requestedSize) {
       return (requestedSize < Integer.MAX_VALUE &&
               requestedSize < maxSingleShuffleLimit);
     }

    It also looks like RamManager.reserve should wait until memory is available, so it shouldn't hit a memory limit for that reason.
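
    In other words, I'd expect the reserve path to behave roughly like the sketch below.  This is only an illustration of a reserve that blocks until memory is freed; the class and field names are made up, and this is not the actual ShuffleRamManager code:

     // Illustrative sketch only (not the real ShuffleRamManager).
     class SimpleRamManager {
       private final long maxSize;   // total in-memory shuffle budget
       private long reserved = 0;    // bytes currently reserved

       SimpleRamManager(long maxSize) {
         this.maxSize = maxSize;
       }

       // Blocks the calling copier until the requested size fits
       // under the budget.
       synchronized void reserve(long requestedSize)
           throws InterruptedException {
         while (reserved + requestedSize > maxSize) {
           wait();
         }
         reserved += requestedSize;
       }

       // Frees memory and wakes up copiers blocked in reserve().
       synchronized void unreserve(long size) {
         reserved -= size;
         notifyAll();
       }
     }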

    What does seem a little strange to me is the following (ReduceTask.java, starting at line 2730):

          // Inform the ram-manager
          ramManager.closeInMemoryFile(mapOutputLength);
          ramManager.unreserve(mapOutputLength);

          // Discard the map-output
          try {
            mapOutput.discard();
          } catch (IOException ignored) {
            LOG.info("Failed to discard map-output from " +
                     mapOutputLoc.getTaskAttemptId(), ignored);
          }
          mapOutput = null;

   So to me it looks like the ramManager unreserves the memory before the mapOutput is discarded.  Shouldn't the mapOutput be discarded _before_ the ramManager unreserves the memory?  If the memory is unreserved before the actual underlying data references are removed, then it seems like another thread could try to allocate memory (ReduceTask.java:2730) before the previous memory is disposed of (mapOutput.discard()).
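
   In other words, I would have expected the ordering to look more like the following (just a sketch of the reordering I have in mind, not a tested patch):

          // Discard the map-output first, so the underlying buffer can
          // actually be released ...
          try {
            mapOutput.discard();
          } catch (IOException ignored) {
            LOG.info("Failed to discard map-output from " +
                     mapOutputLoc.getTaskAttemptId(), ignored);
          }
          mapOutput = null;

          // ... and only then tell the ram-manager the memory is free.
          ramManager.closeInMemoryFile(mapOutputLength);
          ramManager.unreserve(mapOutputLength);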

   Not sure that makes sense.  One thing to note is that the particular job that is failing does have a good number (200k+) of map outputs.  The large number of small map outputs may be why we are triggering a problem.

   Thanks again for your thoughts.

   Andy
-----Original Message-----
From: Jacob R Rideout [mailto:[EMAIL PROTECTED]]
Sent: Sunday, March 07, 2010 1:21 PM
To: [EMAIL PROTECTED]
Cc: Andy Sautins; Ted Yu
Subject: Re: Shuffle In Memory OutOfMemoryError

Ted,

Thank you. I filed MAPREDUCE-1571 to cover this issue. I might have
some time to write a patch later this week.

Jacob Rideout

On Sat, Mar 6, 2010 at 11:37 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> I think there is a mismatch (in ReduceTask.java) between:
>      this.numCopiers = conf.getInt("mapred.reduce.parallel.copies", 5);
> and:
>        maxSingleShuffleLimit = (long)(maxSize *
> MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
> where MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION is 0.25f
>
> because
>      copiers = new ArrayList<MapOutputCopier>(numCopiers);
> so the total memory allocated for in-mem shuffle is 1.25 * maxSize
>
> A JIRA should be filed to correlate the constant 5 above and
> MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION.
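>
> To spell out the worst case as I read it: maxSize is the in-memory
> shuffle budget, and each of the 5 copiers is allowed to hold a segment
> of up to 0.25 * maxSize, so if they all reserve at once you get
>
>      5 * (0.25 * maxSize) = 1.25 * maxSize
>
> which exceeds the budget itself.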
>
> Cheers
>
> On Sat, Mar 6, 2010 at 8:31 AM, Jacob R Rideout <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> We are seeing the following error in our reducers of a particular job:
>>
>> Error: java.lang.OutOfMemoryError: Java heap space
>>        at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
>>        at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
>>        at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
>>        at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>>
>>
>> After enough reducers fail the entire job fails. This error occurs
>> regardless of whether mapred.compress.map.output is true. We were able
>> to avoid the issue by reducing mapred.job.shuffle.input.buffer.percent
>> to 20%. Shouldn't the framework, via ShuffleRamManager.canFitInMemory
>> and ShuffleRamManager.reserve, correctly detect the memory
>> available for allocation? I would think that with poor configuration
>> settings (and default settings in particular) the job may not be as
>> efficient, but wouldn't die.
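>>
>> (For reference, the property we lowered is a standard job
>> configuration setting; expressed as a conf entry it would look
>> something like:
>>
>>   <property>
>>     <name>mapred.job.shuffle.input.buffer.percent</name>
>>     <value>0.20</value>
>>   </property>
>>
>> though how exactly it is set depends on the job setup.)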
>>
>> Here is some more context in the logs, I have attached the full