-
Reduce java.lang.OutOfMemoryError
Kelly Burkhart 2011-02-16, 15:00

Hello, I'm seeing frequent fails in reduce jobs with errors similar to this:

2011-02-15 15:21:10,163 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201102081823_0175_m_002153_0, compressed len: 172492, decompressed len: 172488
2011-02-15 15:21:10,163 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201102081823_0175_r_000034_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

2011-02-15 15:21:10,163 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 172488 bytes (172492 raw bytes) into RAM from attempt_201102081823_0175_m_002153_0
2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201102081823_0175_m_002118_0, compressed len: 161944, decompressed len: 161940
2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask: header: attempt_201102081823_0175_m_001704_0, compressed len: 228365, decompressed len: 228361
2011-02-15 15:21:10,424 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201102081823_0175_r_000034_0: Failed fetch #1 from attempt_201102081823_0175_m_002153_0
2011-02-15 15:21:10,424 FATAL org.apache.hadoop.mapred.TaskRunner: attempt_201102081823_0175_r_000034_0 : Map output copy failure : java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

Some also show this:

Error: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at sun.net.www.http.ChunkedInputStream.<init>(ChunkedInputStream.java:63)
    at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:811)
    at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1072)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1447)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1349)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)

The particular job I'm running is an attempt to merge multiple time series files into a single file. The job tracker shows the following:

Kind    Num Tasks  Complete  Killed  Failed/Killed Task Attempts
map     15795      15795     0       0 / 29
reduce  100        30        70      17 / 29

All of the files I'm reading have records with a timestamp key similar to:

2011-01-03 08:30:00.457000<tab><record>

My map job is a simple python program that ignores rows with times < 08:30:00 and > 15:00:00, determines the type of input row and writes it to stdout with very minor modification. It maintains no state and should not use any significant memory. My reducer is the IdentityReducer.

The input files are individually gzipped then put into hdfs. The total uncompressed size of the output should be around 150G. Our cluster is 32 nodes each of which has 16G RAM and most of which have two 2T drives. We're running hadoop 0.20.2.

Can anyone provide some insight on how we can eliminate this issue? I'm certain this email does not provide enough info, please let me know what further information is needed to troubleshoot.

Thanks in advance,

-Kelly
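For readers following along, a stateless streaming mapper of the kind described above might look roughly like the sketch below. This is not the poster's actual program: the function names, the inclusive window bounds, and the fixed-width microsecond timestamp format are assumptions based only on the sample key shown, and the "determine the type of input row" step is left out because the post does not describe it.

```python
import sys

# Assumed session window, padded to the microsecond format of the sample key
# so that plain string comparison orders the fixed-width times correctly.
START = "08:30:00.000000"
END = "15:00:00.000000"


def in_window(line, start=START, end=END):
    """Return True if the line's timestamp key falls inside [start, end].

    Assumes the key is 'YYYY-MM-DD HH:MM:SS.ffffff' followed by a tab.
    """
    key = line.split("\t", 1)[0]
    parts = key.split(" ")
    if len(parts) != 2:
        return False  # malformed key: drop the row
    return start <= parts[1] <= end


def run(stdin=None, stdout=None):
    """Copy in-window rows from stdin to stdout, one line at a time."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    # Nothing is accumulated between lines, so memory use stays flat.
    for line in stdin:
        if in_window(line):
            stdout.write(line)
```

To use something like this as a streaming mapper, the script would end with a call to `run()`. Because each line is handled and discarded, a mapper of this shape is unlikely to be the source of the heap errors, which is consistent with the stack traces pointing at the reduce-side copy phase.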
-
Re: Reduce java.lang.OutOfMemoryError
real great.. 2011-02-16, 15:02

Hi,
How many reducers are you using currently? Try increasing the number of reducers. Let me know if it helps.

Regards,
R.V.
-
Re: Reduce java.lang.OutOfMemoryError
Kelly Burkhart 2011-02-16, 15:11

I have had it fail with a single reducer and with 100 reducers. Ultimately it needs to be funneled to a single reducer though.

-K
-
Re: Reduce java.lang.OutOfMemoryError
James Seigel 2011-02-16, 15:16

Well, the first thing I'd ask to see (if we can) is the code or a description of what your reducer is doing.

If it is holding on to objects too long or accumulating lists, then with enough data you will run out of memory.

Another thought is that you've just not allocated enough memory for the reducer to run properly anyway. Try passing in a setting for the reducer that raises its memory. 768 MB, perhaps.

James
-
Re: Reduce java.lang.OutOfMemoryError
James Seigel 2011-02-16, 15:17

...oh sorry, I didn't scroll below the exception the first time. Try part 2 (giving the reducer more memory).

James
-
Re: Reduce java.lang.OutOfMemoryError
real great.. 2011-02-16, 15:18

Another possibility could be increasing the memory allocated to the JVM. Not sure how to do it, though.

Regards,
R.V.
-
RE: Reduce java.lang.OutOfMemoryError
Jim Falgout 2011-02-16, 15:43

You can set the amount of memory used by the reducer using the mapreduce.reduce.java.opts property. Set it in mapred-site.xml or override it in your job. You can set it to something like -Xmx512M to increase the amount of memory used by the JVM spawned for the reducer task.
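As an illustration of the "override it in your job" route for a streaming job, here is a small sketch that assembles the command line. The helper name and all paths are hypothetical. It uses mapred.child.java.opts, the property name that also appears elsewhere in this thread; as far as I know that is the one read by 0.20.x, while mapreduce.reduce.java.opts belongs to later releases, so check the name against the mapred-default.xml shipped with your version.

```python
def streaming_command(streaming_jar, input_path, output_path, mapper,
                      heap="-Xmx512m", reducers=1):
    """Build an argv list for a Hadoop Streaming job with a per-job heap override.

    streaming_jar, input_path, output_path and mapper are placeholders
    supplied by the caller; nothing here is specific to one cluster.
    """
    return [
        "hadoop", "jar", streaming_jar,
        # Generic -D options have to come before the streaming options.
        "-D", "mapred.child.java.opts=" + heap,
        "-D", "mapred.reduce.tasks=%d" % reducers,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", "org.apache.hadoop.mapred.lib.IdentityReducer",
    ]
```

The returned list can be handed to `subprocess.call`, or joined with spaces to inspect the command before running it. Setting the value per job this way only helps if the cluster has not marked the property as final.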
-
Re: Reduce java.lang.OutOfMemoryError
Kelly Burkhart 2011-02-16, 16:09

Our cluster admin (who's out of town today) has mapred.child.java.opts set to -Xmx1280 in mapred-site.xml. However, if I go to the job configuration page for a job I'm running right now, it claims this option is set to -Xmx200m. There are other settings in mapred-site.xml that are different too. Why would map/reduce jobs not respect the mapred-site.xml file?

-K
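One way to narrow this down is to read the value straight out of the mapred-site.xml on the machine that submits the job and compare it with what the job configuration page reports. The sketch below does that for configuration text passed in as a string; the function names are made up for illustration. Two background facts it relies on: -Xmx200m is, to my knowledge, the stock default for mapred.child.java.opts in 0.20, so seeing it suggests the site file was not picked up at all, and a JVM reads an -Xmx value with no k/m/g suffix as plain bytes, so it is worth confirming whether the file really says -Xmx1280 or -Xmx1280m.

```python
import re
import xml.etree.ElementTree as ET


def read_property(conf_xml, name):
    """Return the value of a property in Hadoop-style configuration XML text, or None."""
    root = ET.fromstring(conf_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None


def max_heap_bytes(java_opts):
    """Parse the last -Xmx flag in a JVM option string into bytes, or None if absent."""
    matches = re.findall(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if not matches:
        return None
    number, unit = matches[-1]
    # A missing suffix means the number is taken as bytes.
    scale = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}[unit.lower()]
    return int(number) * scale
```

Running `read_property(open(path).read(), "mapred.child.java.opts")` against the conf directory on the submitting host and on a few task nodes shows quickly whether the files differ between machines.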
-
Re: Reduce java.lang.OutOfMemoryError
James Seigel 2011-02-16, 16:15
He might not have that conf distributed out to each machine
Sent from my mobile. Please excuse the typos.
-
Re: Reduce java.lang.OutOfMemoryError
Kelly Burkhart 2011-02-16, 16:20
I should have mentioned this in my last email: I thought of that so I
logged into every machine in the cluster; each machine's mapred-site.xml has the same md5sum.
-
Re: Reduce java.lang.OutOfMemoryError
James Seigel 2011-02-16, 16:30
Hrmmm. Well, as you've pointed out, 200m is quite small and is probably the cause.

Now there might be some overriding settings in whatever you are using to launch the job.

You could mark those values in the main conf so they cannot be overridden, then see what tries to override them in the logs.

Cheers
James

Sent from my mobile. Please excuse the typos.
-
Re: Reduce java.lang.OutOfMemoryError
Kelly Burkhart 2011-02-16, 18:11
OK, the job was preferring the config file on my local machine, which is not part of the cluster, over the cluster config files. That seems completely broken to me; my config was basically empty other than containing the location of the cluster, and my job apparently used defaults rather than the cluster config. It doesn't make sense to me to keep configuration files synchronized on every machine that may access the cluster.

I'm running again; we'll see if it completes this time.

-K
-
Re: Reduce java.lang.OutOfMemoryError
James Seigel 2011-02-16, 18:36
Good luck.
Let me know how it goes.

James

Sent from my mobile. Please excuse the typos.
-
Re: Reduce java.lang.OutOfMemoryError
Rahul Jain 2011-02-16, 19:58
If you google for such memory failures, you'll find the MapReduce tunable that'll help you: mapred.job.shuffle.input.buffer.percent. It is well known that the default values in the Hadoop config don't work well for large data systems.

-Rahul
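For a rough sense of what this tunable controls: in Hadoop releases of this era, the reduce-side copier appears to budget a fraction of the task JVM's heap for holding fetched map outputs in memory, and that fraction is mapred.job.shuffle.input.buffer.percent. The sketch below only works through the arithmetic for the two heap sizes mentioned in this thread. The 0.70 default and the 25% per-segment cap are assumptions based on 0.20-era behavior and should be checked against the release in use. The class and method names are made up for illustration, and the real budget is derived from the JVM's reported maximum memory, which is usually somewhat below the nominal -Xmx value.

```java
public class ShuffleBudgetSketch {
    // Assumed default for mapred.job.shuffle.input.buffer.percent (verify for your release).
    public static final double ASSUMED_BUFFER_PERCENT = 0.70;
    // Assumed cap on one in-memory map output, as a fraction of the shuffle budget.
    public static final double ASSUMED_SINGLE_SEGMENT_FRACTION = 0.25;

    /** Bytes of heap set aside for in-memory shuffle, given a nominal heap size. */
    public static long shuffleBudgetBytes(long heapBytes, double bufferPercent) {
        return Math.round(heapBytes * bufferPercent);
    }

    /** Largest single map output that would still be shuffled into RAM. */
    public static long singleSegmentLimitBytes(long budgetBytes) {
        return Math.round(budgetBytes * ASSUMED_SINGLE_SEGMENT_FRACTION);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 200 MB is the default heap the job actually ran with; 1280 MB is the
        // value the cluster admin intended (assuming megabytes were meant).
        for (long heapMb : new long[] {200, 1280}) {
            long budget = shuffleBudgetBytes(heapMb * mb, ASSUMED_BUFFER_PERCENT);
            long cap = singleSegmentLimitBytes(budget);
            System.out.printf("heap %d MB: shuffle budget %d MB, single-segment cap %d MB%n",
                    heapMb, budget / mb, cap / mb);
        }
    }
}
```

Under these assumptions a 200 MB heap leaves little room for anything besides the shuffle buffers, so either raising the heap or lowering the percentage gives the rest of the reduce task more headroom.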
-
Re: Reduce java.lang.OutOfMemoryError
Kelly Burkhart 2011-02-16, 20:40
Thank you for the hint. I'm fairly new to this so nothing is well
known to me at this time ;-)

-K
-
Re: Reduce java.lang.OutOfMemoryError
Harsh J 2011-02-17, 03:22
Which is why setting cluster values to final helps. See
http://wiki.apache.org/hadoop/FAQ#How_do_I_get_my_MapReduce_Java_Program_to_read_the_Cluster.27s_set_configuration_and_not_just_defaults.3F

On Wed, Feb 16, 2011 at 11:41 PM, Kelly Burkhart <[EMAIL PROTECTED]> wrote:
> OK, the job was preferring the config file on my local machine which
> is not part of the cluster over the cluster config files. That seems
> completely broken to me; my config was basically empty other than
> containing the location of the cluster and my job apparently used
> defaults rather than the cluster config.

Harsh J
www.harshj.com
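The precedence behavior discussed in this thread can be pictured with a small model. This is only a sketch of the idea and does not use Hadoop's own Configuration class: resources are applied in order, a later resource overrides an earlier one, and a key marked final by an earlier resource is locked. The class name, method names, and the way "final" is passed in are all hypothetical. The thread reports the cluster value as -Xmx1280; the sketch assumes -Xmx1280m was intended.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LayeredConfSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Set<String> finalKeys = new HashSet<>();

    /** Applies one resource; later resources win unless the key was already marked final. */
    public void addResource(Map<String, String> resource, Set<String> finalInResource) {
        for (Map.Entry<String, String> e : resource.entrySet()) {
            if (finalKeys.contains(e.getKey())) {
                continue; // an earlier resource locked this key, so the override is ignored
            }
            values.put(e.getKey(), e.getValue());
            if (finalInResource.contains(e.getKey())) {
                finalKeys.add(e.getKey());
            }
        }
    }

    public String get(String key) {
        return values.get(key);
    }

    /** Resolves the child JVM options a task would see in this simplified model. */
    public static String effectiveChildOpts(boolean clusterMarksFinal) {
        String key = "mapred.child.java.opts";
        LayeredConfSketch taskSide = new LayeredConfSketch();
        // 1. The cluster's mapred-site.xml, as loaded on the task node.
        taskSide.addResource(
                Collections.singletonMap(key, "-Xmx1280m"),
                clusterMarksFinal ? Collections.singleton(key) : Collections.<String>emptySet());
        // 2. The submitted job configuration, built on a client machine whose own
        //    config was nearly empty, so it carries the built-in default.
        taskSide.addResource(
                Collections.singletonMap(key, "-Xmx200m"),
                Collections.<String>emptySet());
        return taskSide.get(key);
    }

    public static void main(String[] args) {
        System.out.println("cluster value not final: " + effectiveChildOpts(false));
        System.out.println("cluster value final:     " + effectiveChildOpts(true));
    }
}
```

In this model the job ends up with -Xmx200m when the cluster value is not final and with -Xmx1280m when it is, which matches the symptom Kelly saw and the fix suggested above. How closely the real resource-loading order matches this model is something to confirm against the Hadoop release in use.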