|
Bradford Stephens
2010-09-26, 07:55
Bradford Stephens
2010-09-26, 08:00
Bradford Stephens
2010-09-26, 10:30
Ted Yu
2010-09-26, 13:47
Chris K Wensel
2010-09-26, 15:10
Ted Dunning
2010-09-26, 16:35
Ted Dunning
2010-09-26, 16:37
Bradford Stephens
2010-09-26, 20:19
Ted Yu
2010-09-26, 21:35
Bradford Stephens
2010-09-26, 23:46
Chris K Wensel
2010-09-27, 00:09
Bradford Stephens
2010-09-27, 00:37
Alex Kozlov
2010-09-27, 00:41
Bradford Stephens
2010-09-27, 01:01
Ted Dunning
2010-09-27, 02:00
Vitaliy Semochkin
2010-09-27, 09:20
Bradford Stephens
2010-09-27, 09:46
Bharath Mundlapudi
2010-09-27, 18:24
|
-
java.lang.OutOfMemoryError: GC overhead limit exceeded
Bradford Stephens 2010-09-26, 07:55
Greetings,

I'm running into a brain-numbing problem on Elastic MapReduce. I'm running a decent-size task (22,000 mappers, a ton of GZipped input blocks, ~1TB of data) on 40 c1.xlarge nodes (7 GB RAM, ~8 "cores").

I get failures randomly --- sometimes at the end of my 6-step process, sometimes at the first reducer phase, sometimes in the mapper. It seems to fail in multiple areas. Mostly in the reducers. Any ideas?

Here are the settings I've changed:
-Xmx400m
6 max mappers
1 max reducer
1GB swap partition
mapred.job.reuse.jvm.num.tasks=50
mapred.reduce.parallel.copies=3

java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.nio.CharBuffer.wrap(CharBuffer.java:350)
    at java.nio.CharBuffer.wrap(CharBuffer.java:373)
    at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:138)
    at java.lang.StringCoding.decode(StringCoding.java:173)
    at java.lang.String.<init>(String.java:443)
    at java.lang.String.<init>(String.java:515)
    at org.apache.hadoop.io.WritableUtils.readString(WritableUtils.java:116)
    at cascading.tuple.TupleInputStream.readString(TupleInputStream.java:144)
    at cascading.tuple.TupleInputStream.readType(TupleInputStream.java:154)
    at cascading.tuple.TupleInputStream.getNextElement(TupleInputStream.java:101)
    at cascading.tuple.hadoop.TupleElementComparator.compare(TupleElementComparator.java:75)
    at cascading.tuple.hadoop.TupleElementComparator.compare(TupleElementComparator.java:33)
    at cascading.tuple.hadoop.DelegatingTupleElementComparator.compare(DelegatingTupleElementComparator.java:74)
    at cascading.tuple.hadoop.DelegatingTupleElementComparator.compare(DelegatingTupleElementComparator.java:34)
    at cascading.tuple.hadoop.DeserializerComparator.compareTuples(DeserializerComparator.java:142)
    at cascading.tuple.hadoop.GroupingSortingComparator.compare(GroupingSortingComparator.java:55)
    at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
    at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:136)
    at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
    at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
    at org.apache.hadoop.mapred.Merger.writeFile(Merger.java:156)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.doInMemMerge(ReduceTask.java:2645)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:2586)

--
Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com -- The intuitive, cloud-scale data solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social Media, and Computer Science
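
Editorial note: a quick back-of-the-envelope check of the settings listed in this message, written as a minimal Java sketch. The class and method names (`HeapBudget`, `worstCaseHeapMb`) are hypothetical, and the calculation deliberately ignores non-heap JVM memory and the Hadoop daemons running on the node, so treat it as a rough bound rather than a measurement.

```java
// Sketch only: worst-case Java heap committed by task JVMs on one node.
final class HeapBudget {

    /** Sum of -Xmx over all task JVMs that can run at the same time, in MB. */
    static long worstCaseHeapMb(int mapSlots, int reduceSlots, long xmxMb) {
        return (long) (mapSlots + reduceSlots) * xmxMb;
    }

    public static void main(String[] args) {
        long nodeRamMb = 7L * 1024;                 // c1.xlarge, "7 GB RAM" per the post
        long heapMb = worstCaseHeapMb(6, 1, 400);   // 6 map slots, 1 reduce slot, -Xmx400m
        System.out.println("task heap: " + heapMb + " MB of " + nodeRamMb + " MB");
    }
}
```

With the posted values the task JVMs can claim at most 2800 MB in total, so the node as a whole does not look oversubscribed on heap alone; the constraint is that each individual task JVM has only 400 MB to work with.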
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Bradford Stephens 2010-09-26, 08:00
I'm going to try running it on high-RAM boxes with -Xmx4096m or so, to see if that helps.
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Bradford Stephens 2010-09-26, 10:30
Nope, that didn't seem to help.
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Ted Yu 2010-09-26, 13:47
Have you tried lowering mapred.job.reuse.jvm.num.tasks?
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Chris K Wensel 2010-09-26, 15:10
fwiw

I run m2.xlarge slaves, using the default mappers/reducers (4/2, I think).

With swap:

--bootstrap-action s3://elasticmapreduce/bootstrap-actions/create-swap-file.rb --args "-E,/mnt/swap,1000"

Historically I've run this property with no issues, but I should probably re-research the GC setting (comments please):

"mapred.child.java.opts", "-server -Xmx2000m -XX:+UseParallelOldGC"

I haven't co-installed Ganglia to look at utilization lately, but any more than 4 mappers or more than 2 reducers has always given me headaches.

ckw

--
Chris K Wensel
[EMAIL PROTECTED]
http://www.concurrentinc.com

-- Concurrent, Inc. offers mentoring, support, and licensing for Cascading
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Ted Dunning 2010-09-26, 16:35
The old GC routinely gives me lower performance than modern GC. The default is now quite good for batch programs.

On Sun, Sep 26, 2010 at 8:10 AM, Chris K Wensel <[EMAIL PROTECTED]> wrote:
> historically i'v run this property with no issues, but should probably
> re-research the gc setting (comments please)
> "mapred.child.java.opts", "-server -Xmx2000m -XX:+UseParallelOldGC"
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Ted Dunning 2010-09-26, 16:37
My feeling is that you have some kind of leak going on in your mappers or reducers and that reducing the number of times the JVM is re-used would improve matters.

GC overhead limit indicates that your (tiny) heap is full and collection is not reducing that.

On Sun, Sep 26, 2010 at 12:55 AM, Bradford Stephens <[EMAIL PROTECTED]> wrote:
> mapred.job.reuse.jvm.num.tasks=50
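
Editorial note: for context on the diagnosis in this message, HotSpot's parallel collector raises "GC overhead limit exceeded" when, by default, roughly 98% or more of the time is going to garbage collection while less than about 2% of the heap is being recovered. The following is a minimal Java sketch of how a task could log its own GC time share and heap occupancy with the standard `java.lang.management` beans. The class and method names are made up for illustration and nothing like this was posted in the thread; the GC time figure is approximate, since collectors may do part of their work concurrently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Sketch only: coarse indicators of GC pressure inside a running JVM.
final class GcOverheadProbe {

    /** Approximate fraction of JVM uptime spent in GC so far, clamped to [0, 1]. */
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 if this collector does not report time
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) gcMillis / uptimeMillis);
    }

    /** Fraction of the maximum heap currently in use, or -1 if no maximum is defined. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();              // -1 when the maximum is undefined
        if (max <= 0) {
            return -1.0;
        }
        return (double) heap.getUsed() / max;
    }

    public static void main(String[] args) {
        System.out.println("GC time fraction:   " + gcTimeFraction());
        System.out.println("heap used fraction: " + heapUsedFraction());
    }
}
```

A heap that stays close to full after collections, together with a rising GC time share, would be consistent with the leak hypothesis above; a single snapshot cannot confirm it.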
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Bradford Stephens 2010-09-26, 20:19
Hrm.... no. I've lowered it to -1, but I can try 1 again.

On Sun, Sep 26, 2010 at 6:47 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> Have you tried lowering mapred.job.reuse.jvm.num.tasks ?
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Ted Yu 2010-09-26, 21:35
-1 means there is no limit to reusing.

At the same time, you can generate a heap dump from the OOME and analyze it with YourKit, etc.

Cheers
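
Editorial note: the two points in this message can be sketched in Java as follows. `ChildJvmOpts`, `describeReuse`, `withHeapDump` and the `/mnt/heapdumps` path are hypothetical names chosen for illustration. The reuse semantics shown are those of `mapred.job.reuse.jvm.num.tasks` as described in the thread (1 is the Hadoop default), and `-XX:+HeapDumpOnOutOfMemoryError` and `-XX:HeapDumpPath` are standard HotSpot options; how the resulting string gets passed to the job is left out.

```java
// Sketch only: interpreting the JVM-reuse setting and building child JVM options.
final class ChildJvmOpts {

    /** Describes a value of mapred.job.reuse.jvm.num.tasks. */
    static String describeReuse(int numTasks) {
        if (numTasks == -1) {
            return "unlimited reuse";
        }
        if (numTasks == 1) {
            return "no reuse";
        }
        if (numTasks > 1) {
            return "up to " + numTasks + " tasks per JVM";
        }
        return "not covered by this sketch";
    }

    /** Appends HotSpot heap-dump flags to an existing option string. */
    static String withHeapDump(String childOpts, String dumpDir) {
        return childOpts
                + " -XX:+HeapDumpOnOutOfMemoryError"
                + " -XX:HeapDumpPath=" + dumpDir;
    }

    public static void main(String[] args) {
        System.out.println(describeReuse(-1));
        System.out.println(describeReuse(50));
        // The dump directory is an assumed, illustrative location.
        System.out.println(withHeapDump("-Xmx400m", "/mnt/heapdumps"));
    }
}
```

So moving from 50 to -1 raises the reuse limit rather than lowering it; a value of 1 is the setting that gives every task a fresh JVM.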
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Bradford Stephens 2010-09-26, 23:46
Sadly, making Chris's changes didn't help.

Here's the Cascading code, it's pretty simple but uses the new "combiner"-like functionality:

http://pastebin.com/ccvDmLSX
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Chris K Wensel 2010-09-27, 00:09
Try using a lower threshold value (the number of values in the LRU to cache). This is the tradeoff of this approach.

ckw
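
Editorial note: the tradeoff mentioned in this message can be illustrated with a generic LRU-bounded partial aggregator. This is a minimal Java sketch under stated assumptions, not Cascading's implementation (the thread does not show it); `LruPartialCounter` and its methods are hypothetical. At most `threshold` keys are held in memory; when a new key would exceed that, the least recently used key is evicted and its partial count is emitted downstream.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: map-side partial counting with a bounded LRU cache.
final class LruPartialCounter {
    private final List<Map.Entry<String, Long>> emitted =
            new ArrayList<Map.Entry<String, Long>>();
    private final LinkedHashMap<String, Long> cache;

    LruPartialCounter(final int threshold) {
        // accessOrder = true keeps the least recently used key first.
        this.cache = new LinkedHashMap<String, Long>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                if (size() > threshold) {
                    // Emit the partial result before the entry is dropped.
                    emitted.add(new AbstractMap.SimpleEntry<String, Long>(
                            eldest.getKey(), eldest.getValue()));
                    return true;
                }
                return false;
            }
        };
    }

    /** Adds count to the running total for key, evicting the LRU key if needed. */
    void add(String key, long count) {
        Long current = cache.get(key);
        cache.put(key, current == null ? count : current + count);
    }

    /** Number of keys currently held in memory. */
    int cachedKeys() {
        return cache.size();
    }

    /** Emits everything still cached and returns all partial results so far. */
    List<Map.Entry<String, Long>> flush() {
        for (Map.Entry<String, Long> e : cache.entrySet()) {
            emitted.add(new AbstractMap.SimpleEntry<String, Long>(e.getKey(), e.getValue()));
        }
        cache.clear();
        List<Map.Entry<String, Long>> out =
                new ArrayList<Map.Entry<String, Long>>(emitted);
        emitted.clear();
        return out;
    }
}
```

A lower threshold bounds memory more tightly, but it evicts earlier, so more partial results for the same key reach the reducers; a higher threshold combines more on the map side at the cost of heap.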
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Bradford Stephens 2010-09-27, 00:37
Yup, I've turned it down to 1,000. Let's see if that helps!
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Alex Kozlov 2010-09-27, 00:41
Hi Bradford,
Sometimes the reducers do not handle merging large chunks of data too well. How many reducers do you have? Try increasing the number of reducers (you can always merge the data later if you are worried about too many output files).

--
Alex Kozlov
Solutions Architect
Cloudera, Inc
twitter: alexvk2009
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Bradford Stephens 2010-09-27, 01:01
One of the problems with this data set is that I'm grouping by a category that has only, say, 20 different values. Then I'm doing a unique count of Facebook user IDs per group. I imagine that's not pleasant for the reducers.
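To make the effect of a 20-value grouping key concrete, here is a minimal sketch of how hash partitioning spreads keys over reducers. It assumes the job partitions on the grouping key the way Hadoop's default HashPartitioner does, with `(hashCode & Integer.MAX_VALUE) % numReducers`; the `category-N` key names and the class name are made up for illustration, and Cascading partitions on tuples rather than plain strings.

```java
import java.util.Set;
import java.util.TreeSet;

public class PartitionSkewSketch {
    // Same arithmetic as Hadoop's default HashPartitioner (assumed to apply here).
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Counts how many reducers receive at least one of the group keys.
    static int busyReducers(int numKeys, int numReducers) {
        Set<Integer> used = new TreeSet<>();
        for (int i = 0; i < numKeys; i++) {
            // "category-<i>" is a hypothetical stand-in for the real category values.
            used.add(partitionFor("category-" + i, numReducers));
        }
        return used.size();
    }

    public static void main(String[] args) {
        for (int reducers : new int[] {1, 10, 40, 200}) {
            System.out.println(reducers + " reducers -> "
                + busyReducers(20, reducers) + " receive data");
        }
    }
}
```

However many reducers are configured, at most 20 of them can receive data when there are only 20 distinct keys, and each of those still has to merge every record for its key. That is why adding reducers alone may not relieve the pressure in this case.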
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Ted Dunning 2010-09-27, 02:00
If there are combiners, the reducers shouldn't get any lists longer than a small multiple of the number of maps.
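For a unique count, the map-side combining discussed in this thread amounts to dropping repeated (group, user ID) pairs before they reach the reducer. The sketch below illustrates the idea with a bounded LRU cache. It is not Cascading's implementation; the class name, the `threshold` parameter, and the tab-joined key are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LruDedupSketch {
    private final Map<String, Boolean> seen;
    private final List<String> emitted = new ArrayList<>();

    public LruDedupSketch(final int threshold) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                // Evict the least recently used pair once the cache exceeds the threshold.
                return size() > threshold;
            }
        };
    }

    // Emits a (group, userId) pair only if it is not currently cached.
    public void accept(String group, String userId) {
        String pair = group + "\t" + userId;
        if (seen.get(pair) == null) {   // get() also refreshes recency on a hit
            seen.put(pair, Boolean.TRUE);
            emitted.add(pair);
        }
    }

    public List<String> emitted() {
        return emitted;
    }

    // Helper: runs the input through a cache of the given size and counts emitted pairs.
    static int emittedCount(int threshold, String[][] input) {
        LruDedupSketch dedup = new LruDedupSketch(threshold);
        for (String[] row : input) {
            dedup.accept(row[0], row[1]);
        }
        return dedup.emitted().size();
    }

    public static void main(String[] args) {
        String[][] input = {{"a", "1"}, {"a", "1"}, {"a", "2"}, {"a", "3"}, {"a", "1"}};
        System.out.println(emittedCount(10, input)); // prints 3
        System.out.println(emittedCount(2, input));  // prints 4
    }
}
```

A smaller threshold uses less heap per task but lets more duplicates through after eviction, so the reducer still has to do the final de-duplication. That is the tradeoff behind lowering the threshold.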
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Vitaliy Semochkin 2010-09-27, 09:20
Hi,
"[..] if more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, an OutOfMemoryError will be thrown. This feature is designed to prevent applications from running for an extended period of time while making little or no progress because the heap is too small. If necessary, this feature can be disabled by adding the option -XX:-UseGCOverheadLimit to the command line."

This often happens in MapReduce jobs when you process a lot of data. I recommend trying:

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024m -XX:-UseGCOverheadLimit</value>
    </property>

Also, in my personal experience, when processing a lot of data it is often much cheaper to kill the JVM than to wait for GC. For that reason, if you have a lot of BIG tasks rather than tons of small tasks, do not reuse the JVM: killing the JVM and starting it again is often much cheaper than trying to GC 1 GB of RAM (I don't know why, it just turned out that way in my tests).

    <property>
      <name>mapred.job.reuse.jvm.num.tasks</name>
      <value>1</value>
    </property>

Regards,
Vitaliy S
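Before disabling the overhead limit, it can help to see how much time a task JVM really spends collecting. The following is a rough sketch using the standard `java.lang.management` beans. It reports a lifetime average, which is not the same windowed check HotSpot applies for the 98%/2% rule, and the class and method names are illustrative only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverheadSketch {
    // Approximate fraction of JVM uptime spent in garbage collection so far.
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // returns -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so the collectors have work to do.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            junk[0] = (byte) i;
        }
        System.out.printf("Approximate share of uptime spent in GC: %.2f%%%n",
            gcTimeFraction() * 100.0);
    }
}
```

Logging this value from inside a mapper or reducer (for example at cleanup) gives a rough signal of whether a task is close to thrashing or simply short on heap.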
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Bradford Stephens 2010-09-27, 09:46
It turned out to be a deployment issue involving an old version. Ted and Chris's suggestions were spot-on.

I can't believe how BRILLIANT these combiners from Cascading are. It's cut my processing time down from 20 hours to 50 minutes. AND I cut out about 80% of my hand-crafted code.

Bravo. I look smart now. (Almost).

-B
-
Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Bharath Mundlapudi 2010-09-27, 18:24
A couple of things you can try:

1. Increase the heap size for the tasks.
2. Since your OOM is happening randomly, try setting -XX:+HeapDumpOnOutOfMemoryError in your child JVM parameters. From the heap dump analysis you can at least determine why your heap is growing: whether it is due to a leak, or whether you need to increase the heap size for your mappers or reducers.
3. Another possible cause is poor JVM GC tuning. Sometimes the default settings can't keep up with the garbage being created, and some GC tuning is needed.

-Bharath
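When raising the task heap, it is worth checking the per-node budget first. The arithmetic below uses the numbers from the original post (6 map slots, 1 reduce slot, 7 GB of RAM per c1.xlarge node) and assumes 7 GB means 7168 MB. It counts only the -Xmx heaps and ignores JVM native overhead and the Hadoop daemons, so the real footprint is higher. The class name is illustrative.

```java
public class NodeMemoryBudget {
    // Total heap that all task JVMs on one node may claim, in MB.
    static long committedHeapMb(int mapSlots, int reduceSlots, long xmxMb) {
        return (long) (mapSlots + reduceSlots) * xmxMb;
    }

    public static void main(String[] args) {
        long nodeRamMb = 7L * 1024; // assumption: "7 gb RAM" read as 7168 MB
        for (long xmxMb : new long[] {400, 1024}) {
            long heapMb = committedHeapMb(6, 1, xmxMb);
            System.out.println("-Xmx" + xmxMb + "m -> " + heapMb + " MB of "
                + nodeRamMb + " MB node RAM");
        }
    }
}
```

With -Xmx400m the seven task JVMs can claim 2800 MB. With -Xmx1024m they can claim 7168 MB, which is the whole node, so a larger heap probably needs fewer concurrent map slots to avoid swapping.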