need compress for ObjectSerializer
Hi all,
I think we need to optimize org.apache.pig.impl.util.ObjectSerializer. It uses Java object serialization, which wastes a lot of space and causes the TaskTracker to run out of memory (OOME). Here is the analysis of a TaskTracker heap dump:

This shows that the heap is retained by the JobConf objects, and we know that a JobConf contains a lot of key-value strings.

Here are the heap retention statistics:

Diving into the object histogram:

And here's the source code:

So I think we need to compress the output of ObjectSerializer. I'm submitting a patch.
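
To illustrate the idea, here is a minimal sketch of a compressing serializer. It is not the submitted patch and not Pig's actual code: the class name CompressedObjectSerializer is hypothetical, and the choice of Deflate plus Base64 is an assumption made only for this example. It wraps the Java serialization stream in a DeflaterOutputStream so the string that ends up in the JobConf is smaller when the serialized data is repetitive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper for illustration only; not part of Pig.
final class CompressedObjectSerializer {

    private CompressedObjectSerializer() {
    }

    // Serialize obj with Java serialization, deflate the bytes,
    // and encode the result as a Base64 string.
    static String serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the ObjectOutputStream also closes the deflater stream,
        // which flushes the remaining compressed data into 'bytes'.
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new DeflaterOutputStream(bytes))) {
            out.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse of serialize: Base64-decode, inflate, then read the object.
    static Object deserialize(String encoded) throws IOException {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(
                 new InflaterInputStream(new ByteArrayInputStream(compressed)))) {
            return in.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException("Could not deserialize object", e);
        }
    }
}
```

With something like this, a value would be stored via CompressedObjectSerializer.serialize(value) and read back with deserialize(...). How much space is saved depends on how repetitive the serialized data is, and any reader of the value has to use the matching deserialize.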

Haitao Yao
weibo: @haitao_yao
Skype:  haitao.yao.final

Roh 2012-11-06, 04:16
Haitao Yao 2012-11-06, 04:35