Pig >> mail # user >> workaround for  java.lang.OutOfMemoryError: Java heap space?


workaround for  java.lang.OutOfMemoryError: Java heap space?
I have a Pig script that works well on small test data sets but fails when run over realistic-sized data. The logs show:
  INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201106061024_0331 has failed!
  …
  job_201106061024_0331   CitedItemsGrpByDocId,DedupTCPerDocId    GROUP_BY,COMBINER       Message: Job failed!
  …
 attempt_201106061024_0331_m_000198_0  […]   Error: java.lang.OutOfMemoryError: Java heap space
  and the same error for all attempts at a few of the other (many) map tasks for this job.

I believe this job corresponds to these lines in my Pig script:

 CitedItemsGrpByDocId = group CitedItems by citeddocid;
 DedupTCPerDocId = foreach CitedItemsGrpByDocId {
         CitingDocids =  CitedItems.citingdocid;
         UniqCitingDocids = distinct CitingDocids;
         generate group, COUNT(UniqCitingDocids) as tc;
      };

I tried increasing mapred.child.java.opts, but the job then failed in a setup stage with:
  Error occurred during initialization of VM
  Could not reserve enough space for object heap

Are there job configurations or parameters for Hadoop or Pig that I can set to get around this? Is there a Pig Latin circumlocution, or a better way to express what I want, that is not as memory-hungry?
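
For illustration, one possible rewrite of the lines above is to de-duplicate the (citeddocid, citingdocid) pairs before grouping, so that the distinct no longer runs inside the nested foreach on each group's bag. This is only a sketch: it assumes CitedItems has the fields citeddocid and citingdocid as used above, the aliases CitePairs, UniqCitePairs and UniqGrpByDocId are hypothetical names, and whether it avoids the heap error on the real data is untested.

```pig
-- Sketch only: CitePairs, UniqCitePairs and UniqGrpByDocId are hypothetical aliases.
-- Keep just the two fields needed for the count.
CitePairs       = foreach CitedItems generate citeddocid, citingdocid;
-- De-duplicate (cited, citing) pairs across the whole relation, before grouping.
UniqCitePairs   = distinct CitePairs;
-- Group the already-unique pairs by cited document.
UniqGrpByDocId  = group UniqCitePairs by citeddocid;
-- Count the citing documents per cited document; no nested distinct is needed.
DedupTCPerDocId = foreach UniqGrpByDocId generate group, COUNT(UniqCitePairs.citingdocid) as tc;
```

The intent is to produce the same (group, tc) output as the nested version, at the cost of an extra MapReduce job for the top-level distinct.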

Thanks in advance,

Will

William F Dowling
Sr Technical Specialist, Software Engineering