Pig, mail # user - need compress for ObjectSerializer


Haitao Yao 2012-11-06, 04:06
Roh 2012-11-06, 04:16
Re: need compress for ObjectSerializer
Haitao Yao 2012-11-06, 04:35
Oh, sorry, I just found it in JIRA.
Thanks.

Haitao Yao
[EMAIL PROTECTED]
weibo: @haitao_yao
Skype:  haitao.yao.final

On 2012-11-6, at 12:16 PM, Roh wrote:

> It is already fixed as part of PIG-3017
>
> Sent from my iPad
>
> On Nov 5, 2012, at 8:06 PM, Haitao Yao <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>> I think we need to optimize org.apache.pig.impl.util.ObjectSerializer. It uses Java object serialization, which wastes a lot of space and causes the TaskTracker to hit an OutOfMemoryError (OOME). Here is the analysis of a TaskTracker heap dump:
>> <aa.jpg>
>> This shows that the heap is retained by JobConf objects, and we know that a JobConf contains a lot of key-value strings.
>>
>>
>>
>> Here are the heap retention statistics:
>> <bb.jpg>
>>
>>
>> And drilling into the object histogram, here it is:
>> <cc.jpg>
>>
>>
>> And here's the source code:
>> <dd.jpg>
>>
>>
>> So I think we need to compress the output of the object serializer. I'm submitting my patch.
>>
>>
>>
>>
>>
>> Haitao Yao
>> [EMAIL PROTECTED]
>> weibo: @haitao_yao
>> Skype:  haitao.yao.final
>>
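
For readers who want a concrete picture of the proposal quoted above, here is a minimal sketch of compressing Java-serialized output before it is stored as a string (for example, as a JobConf value). It is not the patch from this thread or the PIG-3017 change itself. The class name CompressedSerializerSketch and its methods are hypothetical, and it assumes Java 8 or later for java.util.Base64. The idea is simply to wrap the ObjectOutputStream in a DeflaterOutputStream, so repetitive serialized data takes less space in the configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration only; not Pig's actual ObjectSerializer.
final class CompressedSerializerSketch {

    // Serialize with standard Java serialization, deflate the bytes,
    // then Base64-encode so the result can be stored as a config string.
    static String serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the ObjectOutputStream also finishes the deflater stream.
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new DeflaterOutputStream(bytes))) {
            out.writeObject(obj);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse the steps: Base64-decode, inflate, then deserialize.
    static Object deserialize(String encoded)
            throws IOException, ClassNotFoundException {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(
                 new InflaterInputStream(new ByteArrayInputStream(compressed)))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // A repetitive payload, loosely standing in for a serialized plan.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("key").append(i % 10).append("=value;");
        }
        String payload = sb.toString();
        String encoded = serialize(payload);
        System.out.println("original chars: " + payload.length());
        System.out.println("encoded chars:  " + encoded.length());
        System.out.println("round trip ok:  " + payload.equals(deserialize(encoded)));
    }
}
```

Base64 adds roughly a third to the size of the compressed bytes, so the saving depends on how repetitive the serialized object is. Payloads that are very small or already compressed may not shrink at all.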