-
need compress for ObjectSerializer
Haitao Yao 2012-11-06, 04:06
Hi all,
I think we need to optimize org.apache.pig.impl.util.ObjectSerializer. It uses Java object serialization, which wastes a lot of space and causes the TaskTracker to hit an OutOfMemoryError (OOME). Here is the analysis of a TaskTracker heap dump:
[attachment: aa.jpg]
This shows that the heap is retained by the JobConf objects, and we know that a JobConf contains a lot of key-value strings.
Here are the heap retention statistics:
[attachment: bb.jpg]
Diving into the object histogram gives this:
[attachment: cc.jpg]
And here is the source code:
[attachment: dd.jpg]
So I think we need to compress the output of the object serializer. I'm submitting my patch.
Haitao Yao [EMAIL PROTECTED] weibo: @haitao_yao Skype: haitao.yao.final
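For readers following the thread, the proposal above (compress the serializer's output before it is stored as a string) can be sketched roughly as below. This is only an illustration under stated assumptions: it is not Pig's ObjectSerializer and not the patch mentioned in the message. The class name CompressedSerializerSketch and its methods are hypothetical, and Deflate plus Base64 are chosen here simply because they ship with the JDK (java.util.Base64 needs Java 8 or later).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.HashMap;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch; not the real org.apache.pig.impl.util.ObjectSerializer.
public class CompressedSerializerSketch {

    // Plain Java serialization, Base64-encoded, used only as a size baseline.
    public static String serializePlain(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Java serialization piped through Deflate before Base64 encoding.
    public static String serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the ObjectOutputStream also finishes the deflater stream.
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new DeflaterOutputStream(bytes))) {
            out.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverses serialize(): Base64 decode, inflate, then read the object.
    public static Object deserialize(String encoded)
            throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(
                 new InflaterInputStream(new ByteArrayInputStream(raw)))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up key-value strings standing in for job configuration entries.
        HashMap<String, String> conf = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            conf.put("example.property.name." + i, "example-value-" + i);
        }
        System.out.println("plain length:      " + serializePlain(conf).length());
        System.out.println("compressed length: " + serialize(conf).length());
        System.out.println("round trip equal:  " + deserialize(serialize(conf)).equals(conf));
    }
}
```

Repetitive key-value strings like the ones in this example compress well, which is why the compressed string comes out shorter here. How much is saved on a real job configuration depends on its contents.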
-
Re: need compress for ObjectSerializer
Roh 2012-11-06, 04:16
It is already fixed as part of PIG-3017.
Sent from my iPad
-
Re: need compress for ObjectSerializer
Haitao Yao 2012-11-06, 04:35
Oh, sorry, I just found it in JIRA. Thanks.
Haitao Yao [EMAIL PROTECTED] weibo: @haitao_yao Skype: haitao.yao.final