Hadoop >> mail # user >> Shuffle In Memory OutOfMemoryError
RE: Shuffle In Memory OutOfMemoryError

   Ah.  My mistake.  We will apply the patch manually to 0.20.2 and re-run.  Just out of curiosity, why do the release notes for 0.20.2 indicate that MAPREDUCE-1182 is included in the release if the patch needs to be applied manually?  Is there an additional part of the patch not included in the release?

   Thanks for your help.

   Andy

-----Original Message-----
From: Ted Yu [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, March 09, 2010 3:33 PM
To: [EMAIL PROTECTED]
Subject: Re: Shuffle In Memory OutOfMemoryError

Andy:
You need to manually apply the patch.

Cheers

On Tue, Mar 9, 2010 at 2:23 PM, Andy Sautins <[EMAIL PROTECTED]> wrote:

>
>   Thanks Ted.  My understanding is that MAPREDUCE-1182 is included in the
> 0.20.2 release.  We upgraded our cluster to 0.20.2 this weekend and re-ran
> the same job scenarios.  Running with mapred.reduce.parallel.copies set to 1,
> we continue to see the same Java heap space error.
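
Setting that property cluster-wide is typically done in mapred-site.xml; a minimal fragment, using the 0.20-era property name discussed in this thread, would look like this (an illustration, not a copy of the poster's actual configuration):

```xml
<!-- mapred-site.xml: limit each reducer to a single parallel copier
     (the default in 0.20.x is 5). -->
<property>
  <name>mapred.reduce.parallel.copies</name>
  <value>1</value>
</property>
```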
>
>
>
> -----Original Message-----
> From: Ted Yu [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, March 09, 2010 12:56 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Shuffle In Memory OutOfMemoryError
>
> This issue has been resolved in
> http://issues.apache.org/jira/browse/MAPREDUCE-1182
>
> Please apply the patch M1182-1v20.patch:
> http://issues.apache.org/jira/secure/attachment/12424116/M1182-1v20.patch
>
> On Sun, Mar 7, 2010 at 3:57 PM, Andy Sautins <[EMAIL PROTECTED]
> >wrote:
>
> >
> >  Thanks Ted.  Very helpful.  You are correct that I misunderstood the
> > code at ReduceTask.java:1535.  I missed the fact that it's in an IOException
> > catch block.  My mistake.  That's what I get for being in a rush.
> >
> >  For what it's worth I did re-run the job with
> > mapred.reduce.parallel.copies set with values from 5 all the way down to 1.
> >  All failed with the same error:
> >
> > Error: java.lang.OutOfMemoryError: Java heap space
> >   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
> >   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
> >   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
> >   at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
> >
> >
> >   So from that it does seem like something else might be going on, yes?
> >  I need to do some more research.
> >
> >  I appreciate your insights.
> >
> >  Andy
> >
> > -----Original Message-----
> > From: Ted Yu [mailto:[EMAIL PROTECTED]]
> > Sent: Sunday, March 07, 2010 3:38 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Shuffle In Memory OutOfMemoryError
> >
> > My observation is based on this call chain:
> > MapOutputCopier.run() calling copyOutput() calling getMapOutput() calling
> > ramManager.canFitInMemory(decompressedLength)
> >
> > Basically ramManager.canFitInMemory() makes its decision without considering
> > the number of MapOutputCopiers that are running. Thus 1.25 * 0.7 of total
> > heap may be used in shuffling if default parameters were used.
> > Of course, you should check the value for mapred.reduce.parallel.copies to
> > see if it is 5. If it is 4 or lower, my reasoning wouldn't apply.
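
The arithmetic behind that 1.25 * 0.7 figure can be sketched as follows. The constants below (a 0.70 shuffle buffer fraction, a 0.25 per-output cap, 5 parallel copiers) are the 0.20-era defaults as described in this thread, so treat this as an illustration of the reasoning rather than the exact 0.20.2 implementation:

```java
public class ShuffleMemoryMath {
    // Fraction of the reducer heap given to the in-memory shuffle buffer
    // (mapred.job.shuffle.input.buffer.percent, default 0.70).
    static final double SHUFFLE_INPUT_BUFFER_PERCENT = 0.70;

    // Each single map output may occupy up to 25% of that buffer
    // (the per-output limit checked by canFitInMemory()).
    static final double MAX_SINGLE_SHUFFLE_LIMIT = 0.25;

    /** Worst-case fraction of total heap that concurrent copiers can hold
     *  when each reservation is admitted without looking at the others. */
    public static double worstCaseHeapFraction(int parallelCopies) {
        return parallelCopies * MAX_SINGLE_SHUFFLE_LIMIT * SHUFFLE_INPUT_BUFFER_PERCENT;
    }

    public static void main(String[] args) {
        // With the default mapred.reduce.parallel.copies = 5:
        // 5 * 0.25 = 1.25 of the buffer, times 0.7 of the heap = 0.875.
        System.out.println(worstCaseHeapFraction(5));
    }
}
```

With 5 copiers the worst case exceeds available headroom on most heaps, which is consistent with the OOM even though each individual reservation passes its check.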
> >
> > About the ramManager.unreserve() call: ReduceTask.java from Hadoop 0.20.2
> > only has 2731 lines, so I have to guess the location of the code snippet
> > you provided. I found this around line 1535:
> >        } catch (IOException ioe) {
> >          LOG.info("Failed to shuffle from " +
> > mapOutputLoc.getTaskAttemptId(),
> >                   ioe);
> >
> >          // Inform the ram-manager
> >          ramManager.closeInMemoryFile(mapOutputLength);
> >          ramManager.unreserve(mapOutputLength);
> >
> >          // Discard the map-output
> >          try {
> >            mapOutput.discard();
> >          } catch (IOException ignored) {
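
The reserve/unreserve bookkeeping that catch block relies on can be sketched in simplified form. This is an illustration of the accounting pattern under discussion, not the actual ShuffleRamManager class; the 0.25 per-output cap is taken from the thread above:

```java
/** Simplified sketch of shuffle RAM bookkeeping: each reservation is
 *  checked only against the per-output cap, so several concurrent
 *  copiers can collectively exceed the overall budget. */
public class RamBudget {
    private final long maxSingle;  // per-output cap: 25% of the buffer
    private long used = 0;         // bytes currently reserved

    public RamBudget(long maxSize) {
        this.maxSingle = (long) (0.25 * maxSize);
    }

    /** Mirrors the canFitInMemory() idea: checks only the per-output cap,
     *  not how much the other copiers have already reserved. */
    public boolean canFitInMemory(long requested) {
        return requested < maxSingle;
    }

    public synchronized void reserve(long n)   { used += n; }

    /** Called from the IOException handler to give the memory back. */
    public synchronized void unreserve(long n) { used -= n; }

    public synchronized long used() { return used; }

    public static void main(String[] args) {
        RamBudget budget = new RamBudget(1000);
        budget.reserve(200);
        budget.unreserve(200);  // the failure path releases the reservation
        System.out.println(budget.used());
    }
}
```

The snippet at line 1535 is the failure path of this pattern: on an IOException the reservation is released so the lost bytes do not stay counted against the shuffle buffer.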