Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
MapReduce >> mail # dev >> About reducer's Shuffle JVM Heap Size

Copy link to this message
About reducer's Shuffle JVM Heap Size
Hi, all.
I'm debugging Hadoop's source code and find an incomprehensible setting.
In hadoop-0.20.2 and hadoop-1.0.3, reducer's shuffle buffer size cannot
exceed 2048MB (i.e., Integer.MAX_VALUE).  In this way, although*//*
reducer's JVM size can be set more than 2048MB (e.g.,
mapred.child.java.opts=-Xmx4000m), the heap size used for shuffle buffer
is at most "2048MB * maxInMemCopyUse (default 0.7)" not "4000MB *
I think it's not a reasonable setting for large memory machines.

The following code taken from "org.apache.hadoop.mapred.ReduceTask"
shows the concrete algorithm.
I'm wondering why maxSize is declared as "long" but set as "(int)". I
point out the corresponding code  with "-->"

Thanks for any suggestion.
       private final long maxSize;
       private final long maxSingleShuffleLimit;

       private long size = 0;

       private Object dataAvailable = new Object();
       private long fullSize = 0;
       private int numPendingRequests = 0;
       private int numRequiredMapOutputs = 0;
       private int numClosed = 0;
       private boolean closed = false;

       public ShuffleRamManager(Configuration conf) throws IOException {
         final float maxInMemCopyUse conf.getFloat("mapred.job.shuffle.input.buffer.percent", 0.70f);
         if (maxInMemCopyUse > 1.0 || maxInMemCopyUse < 0.0) {
           throw new IOException("mapred.job.shuffle.input.buffer.percent" +
         // Allow unit tests to fix Runtime memory
-->   maxSize = (int)(conf.getInt("mapred.job.reduce.total.mem.bytes",
-->        (int)Math.min(Runtime.getRuntime().maxMemory(),
-->      * maxInMemCopyUse);
         maxSingleShuffleLimit = (long)(maxSize *
         LOG.info("ShuffleRamManager: MemoryLimit=" + maxSize +
                  ", MaxSingleShuffleLimit=" + maxSingleShuffleLimit);