Pig >> mail # user >> java.lang.OutOfMemoryError: Java heap space


java.lang.OutOfMemoryError: Java heap space
Hi all,

I am running a Pig script that works fine on small data. But when I
scale up the data, I get the following error in the map stage.
Please see the map logs below.

My Pig script does a group by first, followed by a join on the grouped
data.
Any clues on where I should look, or how I should deal with this
situation? I don't want to simply increase the heap space.
My map JVM heap space is already 3 GB, with io.sort.mb = 768 MB.

2014-02-06 19:15:12,243 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2014-02-06 19:15:15,025 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2014-02-06 19:15:15,123 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@2bd9e282
2014-02-06 19:15:15,546 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 768
2014-02-06 19:15:19,846 INFO org.apache.hadoop.mapred.MapTask: data buffer = 612032832/644245088
2014-02-06 19:15:19,846 INFO org.apache.hadoop.mapred.MapTask: record buffer = 9563013/10066330
2014-02-06 19:15:20,037 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
2014-02-06 19:15:21,083 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: Created input record counter: Input records from _1_tmp1327641329
2014-02-06 19:15:52,894 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2014-02-06 19:15:52,895 INFO org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 611949600; bufvoid = 644245088
2014-02-06 19:15:52,895 INFO org.apache.hadoop.mapred.MapTask: kvstart = 0; kvend = 576; length = 10066330
2014-02-06 19:16:06,182 INFO org.apache.hadoop.mapred.MapTask: Finished spill 0
2014-02-06 19:16:16,169 INFO org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call - Collection threshold init = 328728576(321024K) used = 1175055104(1147514K) committed = 1770848256(1729344K) max = 2097152000(2048000K)
2014-02-06 19:16:20,446 INFO org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate of 308540402 bytes from 1 objects. init = 328728576(321024K) used = 1175055104(1147514K) committed = 1770848256(1729344K) max = 2097152000(2048000K)
2014-02-06 19:17:22,246 INFO org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call- Usage threshold init = 328728576(321024K) used = 1768466512(1727018K) committed = 1770848256(1729344K) max = 2097152000(2048000K)
2014-02-06 19:17:35,597 INFO org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate of 1073462600 bytes from 1 objects. init = 328728576(321024K) used = 1768466512(1727018K) committed = 1770848256(1729344K) max = 2097152000(2048000K)
2014-02-06 19:18:01,276 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2014-02-06 19:18:01,288 INFO org.apache.hadoop.mapred.MapTask: bufstart = 611949600; bufend = 52332788; bufvoid = 644245088
2014-02-06 19:18:01,288 INFO org.apache.hadoop.mapred.MapTask: kvstart = 576; kvend = 777; length = 10066330
2014-02-06 19:18:03,377 INFO org.apache.hadoop.mapred.MapTask: Finished spill 1
2014-02-06 19:18:05,494 INFO org.apache.hadoop.mapred.MapTask: Record too large for in-memory buffer: 644246693 bytes
2014-02-06 19:18:36,008 INFO org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate of 306271368 bytes from 1 objects. init = 328728576(321024K) used = 1449267128(1415299K) committed = 2097152000(2048000K) max = 2097152000(2048000K)
2014-02-06 19:18:44,448 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-02-06 19:18:44,780 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2786)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at java.io.DataOutputStream.writeUTF(DataOutputStream.java:384)
    at java.io.DataOutputStream.writeUTF(DataOutputStream.java:306)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:454)
    at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:542)
    at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:523)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
    at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:542)
    at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
    at org.apache.pig.data.BinSedesTuple.write(BinSedesTuple.java:57)
    at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:123)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:179)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.spillSingleRecord(MapTask.java:1501)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1091)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:128)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:269)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:76
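
For anyone reading the log above, here is a rough Python sketch that pulls out three of the figures it reports and compares the size of the oversized record with the map output buffer. The three input lines are copied from the log; the helper name summarize, the dictionary keys, and the reading of the second number in "data buffer = a/b" as the buffer capacity are my own assumptions (that number does match bufvoid in the same log), not anything defined by Hadoop or Pig.

```python
import re

# Three lines copied verbatim from the map log in this message.
LOG_LINES = [
    "2014-02-06 19:15:15,546 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 768",
    "2014-02-06 19:15:19,846 INFO org.apache.hadoop.mapred.MapTask: data buffer = 612032832/644245088",
    "2014-02-06 19:18:05,494 INFO org.apache.hadoop.mapred.MapTask: "
    "Record too large for in-memory buffer: 644246693 bytes",
]


def summarize(lines):
    """Extract the buffer figures and the oversized record size from log lines.

    The key names are hypothetical labels chosen for this sketch.
    """
    found = {}
    for line in lines:
        m = re.search(r"io\.sort\.mb = (\d+)", line)
        if m:
            found["io_sort_mb"] = int(m.group(1))
        m = re.search(r"data buffer = (\d+)/(\d+)", line)
        if m:
            # Assumption: first number is the spill threshold, second the capacity.
            found["buffer_threshold"] = int(m.group(1))
            found["buffer_capacity"] = int(m.group(2))
        m = re.search(r"Record too large for in-memory buffer: (\d+) bytes", line)
        if m:
            found["record_bytes"] = int(m.group(1))
    return found


summary = summarize(LOG_LINES)
# How many bytes the single record exceeds the data buffer by.
overflow = summary["record_bytes"] - summary["buffer_capacity"]
print("io.sort.mb:", summary["io_sort_mb"])
print("record bytes:", summary["record_bytes"])
print("buffer capacity:", summary["buffer_capacity"])
print("record exceeds buffer by:", overflow, "bytes")
```

On these figures, the one record reported at 19:18:05 is larger than the whole data buffer, which is consistent with the spillSingleRecord frame in the stack trace. The sketch only restates numbers already in the log; it does not identify which group produces the record.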