Re: Killed : GC overhead limit exceeded
Ted Yu 2010-07-18, 19:06
That's what I suggested.
You can actually declare tabPattern, etc. as static variables of your Mapper class.
You can also lower -Xmx to give other processes on the same node more memory.

Cheers

On Sun, Jul 18, 2010 at 4:11 AM, Alan Miller <[EMAIL PROTECTED]> wrote:

> Thanks Ted,
>
> One or both suggestions remedied the problem. I'm not seeing that error
> anymore.
>
> In my Driver class I used config.set("mapred.child.java.opts", "-Xmx2048m -Xincgc");
> But I also altered my mapred-site.xml and set:
>   io.file.buffer.size 65536
>   io.sort.factor 32
>   io.sort.mb 320
>
> For the 2nd suggestion. I'm a java novice, so I'm not sure if this actually
> does what you intended:
>
> I moved the 3 Patterns outside my map() and changed the logic to this:
>
> public class MyMapper extends Mapper<Object, Text, Text, Text> {
>
>     Pattern tabPattern = Pattern.compile("\t");
>     Pattern eolPattern = Pattern.compile("\n");
>     Pattern spacePattern = Pattern.compile("(^[\\s]*)|([\\s]$)");
>
>     public void map(Object key, Text value, Context context) {
>         for (String line : eolPattern.split(value.toString())) {
>             ....
>             String[] values = tabPattern.split(line);
>
>             for (int i = 0; i < values.length; i++) {
>                 values[i] = spacePattern.matcher(values[i]).replaceAll("");
>             }
>             parser.setvals(values);
>             ....
>         }
>     }
> }
>
> Alan
>
> On 07/17/2010 07:28 AM, Ted Yu wrote:
>
>> Have you tried increasing memory beyond 1GB for your map task?
>>
>> I think you have noticed that both OOME came from Pattern.compile().
>>
>> Please take a look at
>> http://www.docjar.com/html/api/java/lang/String.java.html
>>
>> I would suggest pre-compiling the three patterns when setting up your mapper
>> - basically write your own split() and replaceAll().
>>
>> I recently did something similar. You can find out the performance
>> improvement by customization -
>> https://issues.apache.org/jira/browse/MAPREDUCE-1946
>>
>> Cheers
>>
>> On Fri, Jul 16, 2010 at 6:06 AM, Some Body <[EMAIL PROTECTED]> wrote:
>>
>>> Guess attachments are stripped.
>>>
>>> Here's the memory graph: http://tinyurl.com/37g3hmu
>>> Here's the VM Summary: http://tinyurl.com/36wqzjq
>>>
>>> Alan
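
For reference, the static-variable suggestion might look roughly like the sketch below. It is not Alan's actual mapper: the class name LineCleaner and its two helper methods are hypothetical, and the Hadoop Mapper base class is left out so the snippet compiles on its own. The trim pattern is also an assumption: it uses \s+ at both ends so that all trailing whitespace is removed, whereas the pattern in the thread only strips a single trailing whitespace character.

```java
import java.util.regex.Pattern;

// Hypothetical helper, standing in for the body of a Mapper class.
// The patterns are compiled once when the class loads, not once per
// map() call and not once per String.split()/replaceAll() call.
class LineCleaner {
    private static final Pattern TAB_PATTERN = Pattern.compile("\t");
    private static final Pattern EOL_PATTERN = Pattern.compile("\n");
    // Assumption: strip all leading and trailing whitespace.
    private static final Pattern EDGE_SPACE_PATTERN = Pattern.compile("^\\s+|\\s+$");

    // Splits one input record into lines using the precompiled pattern.
    static String[] splitLines(String record) {
        return EOL_PATTERN.split(record);
    }

    // Splits a line on tabs and trims whitespace around each field.
    static String[] splitAndTrim(String line) {
        String[] values = TAB_PATTERN.split(line);
        for (int i = 0; i < values.length; i++) {
            values[i] = EDGE_SPACE_PATTERN.matcher(values[i]).replaceAll("");
        }
        return values;
    }
}
```

Inside a real mapper, map() would loop over LineCleaner.splitLines(value.toString()) and hand each result of splitAndTrim(line) to the parser. Because Pattern instances are immutable and thread-safe, sharing them as static finals is safe; only the Matcher objects are created per call.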