Hadoop >> mail # user >> OutOfMemoryError of PIG job (UDF loads big file)


Re: OutOfMemoryError of PIG job (UDF loads big file)
Hi Jiang,

You should set the property *mapred.child.java.opts* in mapred-site.xml to
increase the heap available to each task JVM,
as follows:

 <property>
         <name>mapred.child.java.opts</name>
         <value>-Xmx1024m</value>
 </property>

Then restart your Hadoop cluster.
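
To check that the larger heap actually reached the task JVMs, one option is to log the JVM's maximum heap from inside the UDF and look at the task logs. Below is a minimal sketch, not part of Pig or Hadoop: the class name HeapCheck, the helper names, and the 0.85 tolerance are all assumptions made for illustration. The only JDK call it relies on is Runtime.getRuntime().maxMemory().

```java
public class HeapCheck {
    // Convert a byte count to whole mebibytes.
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // True when the reported max heap is at least `fraction` of the requested size.
    // The JVM can report somewhat less than the -Xmx value, so the comparison
    // uses a tolerance (an assumed value, not a documented constant).
    public static boolean roughlyAtLeast(long maxHeapBytes, long requestedMiB, double fraction) {
        return toMiB(maxHeapBytes) >= (long) (requestedMiB * fraction);
    }

    public static void main(String[] args) {
        // Maximum heap this JVM will attempt to use, in bytes.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + toMiB(maxHeap) + " MiB");
        System.out.println("At least ~1024 MiB: " + roughlyAtLeast(maxHeap, 1024, 0.85));
    }
}
```

The same two lines from main could be placed at the start of the UDF's exec method. If the reported value is still the old default, the setting probably did not reach the task JVMs.
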
On Tue, Feb 23, 2010 at 9:43 AM, jiang licht <[EMAIL PROTECTED]> wrote:

> I am running a Hadoop job written in Pig. It fails with an out-of-memory
> error because a UDF loads a big file and consumes a lot of memory. What
> are the settings to avoid the following OutOfMemoryError? I guess that simply
> giving Pig more memory (java -XmxBIGmemory org.apache.pig.Main ...) won't
> work.
>
> Error message --->
>
> java.lang.OutOfMemoryError: Java heap space
>        at java.util.regex.Pattern.compile(Pattern.java:1451)
>        at java.util.regex.Pattern.<init>(Pattern.java:1133)
>        at java.util.regex.Pattern.compile(Pattern.java:823)
>        at java.lang.String.split(String.java:2293)
>        at java.lang.String.split(String.java:2335)
>        at UDF.load(Unknown Source)
>        at UDF.load(Unknown Source)
>        at UDF.exec(Unknown Source)
>        at UDF.exec(Unknown Source)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:287)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:278)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93)
>        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>        at org.apache.hadoop.mapred.Child.main(Child.java:155)
>
> Thanks!
> Michael
>
>
>
>
--
Best Regards

Jeff Zhang