Hadoop >> mail # user >> Killed : GC overhead limit exceeded


Thread:
Some Body 2010-07-16, 12:12
Some Body 2010-07-16, 12:40
Some Body 2010-07-16, 13:06
Ted Yu 2010-07-17, 05:28
Alan Miller 2010-07-18, 11:11

Re: Killed : GC overhead limit exceeded
That's what I suggested.
You can actually declare tabPattern, etc. as static variables of your Mapper
class.
You can also lower -Xmx to give other processes on the same node more
memory.
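Ted's static-field suggestion, sketched outside Hadoop as a plain-Java example (the class and helper names here are hypothetical, not from the thread): the patterns are compiled once when the class loads instead of once per record.

```java
import java.util.regex.Pattern;

public class StaticPatterns {
    // Compiled once per JVM at class-load time, shared by every map() call
    static final Pattern TAB = Pattern.compile("\t");
    static final Pattern TRIM = Pattern.compile("(^\\s*)|(\\s*$)");

    static String[] splitAndTrim(String line) {
        String[] vals = TAB.split(line);
        for (int i = 0; i < vals.length; i++) {
            // Strip leading and trailing whitespace from each field
            vals[i] = TRIM.matcher(vals[i]).replaceAll("");
        }
        return vals;
    }

    public static void main(String[] args) {
        String[] out = splitAndTrim(" a \tb\t c ");
        System.out.println(String.join("|", out));  // prints a|b|c
    }
}
```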

Cheers

On Sun, Jul 18, 2010 at 4:11 AM, Alan Miller <[EMAIL PROTECTED]> wrote:

> Thanks Ted,
>
> One or both suggestions remedied the problem.  I'm not seeing that error
> anymore.
>
> In my Driver class I used config.set("mapred.child.java.opts", "-Xmx2048m
> -Xincgc");
> But I also altered my mapred-site.xml and set:
>    io.file.buffer.size 65536
>    io.sort.factor 32
>    io.sort.mb 320
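For reference, those three settings would look like this in mapred-site.xml, inside the configuration element (values as Alan reports; these are the pre-0.21 property names):

```xml
<property>
  <name>io.file.buffer.size</name>
  <value>65536</value>
</property>
<property>
  <name>io.sort.factor</name>
  <value>32</value>
</property>
<property>
  <name>io.sort.mb</name>
  <value>320</value>
</property>
```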
>
> For the 2nd suggestion: I'm a Java novice, so I'm not sure if this actually
> does what you intended:
>
> I moved the 3 Patterns outside my map() and changed the logic to this:
>
> public class MyMapper extends Mapper<Object, Text, Text, Text> {
>
>  Pattern tabPattern = Pattern.compile("\t");
>  Pattern eolPattern = Pattern.compile("\n");
>  Pattern spacePattern = Pattern.compile("(^[\\s]*)|([\\s]$)");
>
>  public void map(Object key, Text value, Context context) {
>      for (String line : eolPattern.split(value.toString())) {
>        ....
>        String[] values = tabPattern.split(line);
>
>        for (int i=0; i < values.length; i++) {
>            values[i] = spacePattern.matcher(values[i]).replaceAll("");
>        }
>        parser.setvals(values);
>        ....
>    }
>  }
> }
>
> Alan
>
>
> On 07/17/2010 07:28 AM, Ted Yu wrote:
>
>> Have you tried increasing memory beyond 1GB for your map task ?
>>
>> I think you have noticed that both OOME came from Pattern.compile().
>>
>> Please take a look at
>> http://www.docjar.com/html/api/java/lang/String.java.html
>>
>> I would suggest pre-compiling the three patterns when setting up your
>> mapper
>> - basically write your own split() and replaceAll().
>>
>> I recently did something similar. You can find out the performance
>> improvement by customization -
>> https://issues.apache.org/jira/browse/MAPREDUCE-1946
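One concrete reading of "write your own split()" is a regex-free version for a single-character delimiter, so no Pattern or Matcher objects are involved at all. A minimal sketch (the class and method names are hypothetical, not taken from MAPREDUCE-1946):

```java
import java.util.ArrayList;
import java.util.List;

public class ManualSplit {
    // Splits on a single character using indexOf, with no regex machinery,
    // so nothing regex-related is allocated per record.
    static List<String> split(String line, char sep) {
        List<String> parts = new ArrayList<String>();
        int start = 0;
        int idx;
        while ((idx = line.indexOf(sep, start)) >= 0) {
            parts.add(line.substring(start, idx));
            start = idx + 1;
        }
        parts.add(line.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("a\tb\tc", '\t'));  // prints [a, b, c]
    }
}
```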
>>
>> Cheers
>>
>> On Fri, Jul 16, 2010 at 6:06 AM, Some Body<[EMAIL PROTECTED]>
>>  wrote:
>>
>>
>>
>>> Guess attachments are stripped.
>>>
>>> Here's the memory graph:   http://tinyurl.com/37g3hmu
>>> Here's the VM Summary:   http://tinyurl.com/36wqzjq
>>>
>>> Alan