Hadoop user mailing list: Shuffle In Memory OutOfMemoryError


Earlier messages in this thread:
Jacob R Rideout 2010-03-06, 16:31
Ted Yu 2010-03-07, 06:37
Jacob R Rideout 2010-03-07, 20:20

RE: Shuffle In Memory OutOfMemoryError

   Ted,

   I'm trying to follow the logic in your mail, but I'm not sure I do.  If you wouldn't mind helping me understand, I would appreciate it.

   Looking at the code, maxSingleShuffleLimit is only used to determine whether the copy _can_ fit into memory:

     boolean canFitInMemory(long requestedSize) {
        return (requestedSize < Integer.MAX_VALUE &&
                requestedSize < maxSingleShuffleLimit);
      }

    It also looks like RamManager.reserve should wait until memory is available, so it shouldn't hit a memory limit for that reason.
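
   To make sure I'm reading it the same way you are, here is a stripped-down sketch of how I understand that interaction.  This is my own simplification, not the actual ShuffleRamManager code; I've only borrowed the field names:

      // My own simplification, NOT the real ShuffleRamManager: canFitInMemory()
      // only checks a request against the per-segment cap, while reserve()
      // blocks until the outstanding total fits under the overall budget.
      class SimpleRamManager {
        private final long maxSize;               // total in-memory shuffle budget
        private final long maxSingleShuffleLimit; // per-segment cap
        private long used = 0;

        SimpleRamManager(long maxSize) {
          this.maxSize = maxSize;
          this.maxSingleShuffleLimit = (long)(maxSize * 0.25f);
        }

        boolean canFitInMemory(long requestedSize) {
          return (requestedSize < Integer.MAX_VALUE &&
                  requestedSize < maxSingleShuffleLimit);
        }

        synchronized void reserve(long requestedSize) throws InterruptedException {
          while (used + requestedSize > maxSize) {
            wait();                               // block until memory is released
          }
          used += requestedSize;
        }

        synchronized void unreserve(long requestedSize) {
          used -= requestedSize;
          notifyAll();                            // wake any blocked copiers
        }
      }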

    What does seem a little strange to me is the following (ReduceTask.java, starting at line 2730):

          // Inform the ram-manager
          ramManager.closeInMemoryFile(mapOutputLength);
          ramManager.unreserve(mapOutputLength);

          // Discard the map-output
          try {
            mapOutput.discard();
          } catch (IOException ignored) {
            LOG.info("Failed to discard map-output from " +
                     mapOutputLoc.getTaskAttemptId(), ignored);
          }
          mapOutput = null;

   So to me it looks like the ramManager unreserves the memory before the mapOutput is discarded.  Shouldn't the mapOutput be discarded _before_ the ramManager unreserves the memory?  If the memory is unreserved before the underlying data references are actually dropped, then it seems like another thread could reserve that memory (ReduceTask.java:2730) before the previous buffer has been disposed of (mapOutput.discard()).
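
   Purely as a sketch of what I mean (not a tested patch), I would have expected the order to look more like:

          // Sketch of the reordering I mean: drop the in-memory map output
          // before telling the ram-manager the memory is free, so another
          // copier can't reserve it while the old buffer is still reachable.

          // Discard the map-output first
          try {
            mapOutput.discard();
          } catch (IOException ignored) {
            LOG.info("Failed to discard map-output from " +
                     mapOutputLoc.getTaskAttemptId(), ignored);
          }
          mapOutput = null;

          // Only then inform the ram-manager
          ramManager.closeInMemoryFile(mapOutputLength);
          ramManager.unreserve(mapOutputLength);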

   Not sure that makes sense.  One thing to note is that the particular job that is failing has a large number (200k+) of map outputs.  That many small map outputs may be why we are triggering the problem.

   Thanks again for your thoughts.

   Andy
-----Original Message-----
From: Jacob R Rideout [mailto:[EMAIL PROTECTED]]
Sent: Sunday, March 07, 2010 1:21 PM
To: [EMAIL PROTECTED]
Cc: Andy Sautins; Ted Yu
Subject: Re: Shuffle In Memory OutOfMemoryError

Ted,

Thank you. I filed MAPREDUCE-1571 to cover this issue. I might have
some time to write a patch later this week.

Jacob Rideout

On Sat, Mar 6, 2010 at 11:37 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> I think there is a mismatch (in ReduceTask.java) between:
>      this.numCopiers = conf.getInt("mapred.reduce.parallel.copies", 5);
> and:
>        maxSingleShuffleLimit = (long)(maxSize *
> MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION);
> where MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION is 0.25f
>
> because
>      copiers = new ArrayList<MapOutputCopier>(numCopiers);
> so the total memory allocated for in-mem shuffle is 1.25 * maxSize
>
> A JIRA should be filed to correlate the constant 5 above and
> MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION.
>
> Cheers
>
> On Sat, Mar 6, 2010 at 8:31 AM, Jacob R Rideout <[EMAIL PROTECTED]>wrote:
>
>> Hi all,
>>
>> We are seeing the following error in our reducers of a particular job:
>>
>> Error: java.lang.OutOfMemoryError: Java heap space
>>        at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
>>        at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
>>        at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
>>        at
>> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
>>
>>
>> After enough reducers fail the entire job fails. This error occurs
>> regardless of whether mapred.compress.map.output is true. We were able
>> to avoid the issue by reducing mapred.job.shuffle.input.buffer.percent
>> to 20%. Shouldn't the framework, via ShuffleRamManager.canFitInMemory
>> and ShuffleRamManager.reserve, correctly detect the memory
>> available for allocation? I would think that with poor configuration
>> settings (and default settings in particular) the job may not be as
>> efficient, but wouldn't die.
>>
>> Here is some more context in the logs; I have attached the full
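
   For what it's worth, here is the back-of-the-envelope arithmetic I get from Ted's numbers above, along with the workaround Jacob described.  The 0.70 default for mapred.job.shuffle.input.buffer.percent and the 1 GB reduce-task heap are assumptions on my part, so treat the exact figures as illustrative:

      // Back-of-the-envelope numbers only.  The 0.70 default for
      // mapred.job.shuffle.input.buffer.percent and the 1 GB heap are
      // assumptions; the other values come from the messages above.
      public class ShuffleBudgetMath {
        public static void main(String[] args) {
          long reduceTaskHeap = 1L << 30;         // assume a 1 GB reduce-task heap
          float inputBufferPercent = 0.70f;       // mapred.job.shuffle.input.buffer.percent
          int parallelCopies = 5;                 // mapred.reduce.parallel.copies default
          float segmentFraction = 0.25f;          // MAX_SINGLE_SHUFFLE_SEGMENT_FRACTION

          long maxSize = (long)(reduceTaskHeap * inputBufferPercent);
          long maxSingleShuffleLimit = (long)(maxSize * segmentFraction);

          // Ted's point: five copiers each holding a segment just under the
          // per-segment cap would want 1.25x the overall budget.
          long allCopiersAtCap = (long) parallelCopies * maxSingleShuffleLimit;

          System.out.printf("in-memory shuffle budget (maxSize) = %d MB%n", maxSize >> 20);
          System.out.printf("per-segment cap                    = %d MB%n", maxSingleShuffleLimit >> 20);
          System.out.printf("5 copiers at the cap               = %d MB (%.2fx the budget)%n",
              allCopiersAtCap >> 20, (double) allCopiersAtCap / maxSize);

          // Jacob's workaround shrinks the budget so even that worst case stays
          // well inside the heap, e.g. in the job configuration:
          //   conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.20f);
        }
      }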

Later messages in this thread:
Ted Yu 2010-03-07, 22:38
Andy Sautins 2010-03-07, 23:57
Ted Yu 2010-03-08, 03:40
Ted Yu 2010-03-09, 19:55
Andy Sautins 2010-03-09, 22:23
Ted Yu 2010-03-09, 22:33
Andy Sautins 2010-03-09, 22:41
baleksan@... 2010-03-09, 23:17
Christopher Douglas 2010-03-10, 00:19
Andy Sautins 2010-03-10, 01:01
Ted Yu 2010-03-10, 03:43
Christopher Douglas 2010-03-10, 07:51
Ted Yu 2010-03-10, 13:26
Chris Douglas 2010-03-10, 23:34
Ted Yu 2010-03-11, 03:48
Ted Yu 2010-03-11, 03:54
Bo Shi 2010-05-08, 03:08
Ted Yu 2010-05-08, 03:25
Alex Kozlov 2010-05-09, 04:41