Pig, mail # user - How can I split the data with more reducers?


Haitao Yao 2012-09-16, 02:08
Haitao Yao 2012-09-16, 02:50
Dmitriy Ryaboy 2012-09-16, 08:41
Re: How can I split the data with more reducers?
Haitao Yao 2012-09-16, 09:05
Here's the explain result, compressed. (The Apache mail server does not allow big attachments.)

Haitao Yao
[EMAIL PROTECTED]
weibo: @haitao_yao
Skype:  haitao.yao.final

On 2012-9-16, at 4:41 PM, Dmitriy Ryaboy wrote:

> I would still like to see the script or the explain plan.
>
> D
>
> On Sat, Sep 15, 2012 at 7:50 PM, Haitao Yao <[EMAIL PROTECTED]> wrote:
>> No, I also thought it was a mapper, but it is definitely a reducer. All the mappers succeeded and the reducer failed.
>>
>>
>>
>> Haitao Yao
>> [EMAIL PROTECTED]
>> weibo: @haitao_yao
>> Skype:  haitao.yao.final
>>
>> On 2012-9-16, at 10:08 AM, Haitao Yao wrote:
>>
>>> Hi,
>>>      I've encountered a problem: the job failed because POSplit retained too much memory in the reducer. How can I specify more reducers for the split?
>>>
>>>      Here's a screenshot of the heap dump.
>>>      <aa.jpg>
>>>
>>>
>>> And here's the snippet of my split script:
>>>
>>>      split RawData into AURawData if type == 2, NURawData if type == 1, InRawData if type == 9, GCData if type == 61, HCData if type == 71, TutorialRawData if type == 3 or type == 15;
>>>
>>> There are 3 similar split clauses in my script. The reducer count is always 1. How can I increase it?
>>>
>>> Thanks.
>>>
>>>
>>>
>>> Haitao Yao
>>> [EMAIL PROTECTED]
>>> weibo: @haitao_yao
>>> Skype:  haitao.yao.final
>>>
>>
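
For context on the question quoted above, here is a minimal Pig Latin sketch, not taken from the poster's script. SPLIT does not accept a PARALLEL clause of its own, so if it ends up running in a reduce phase, the reducer count would normally come from the upstream reduce-side operator or from the script-wide default. The LOAD path, the field list, the DISTINCT step, the output paths, and the value 20 are all assumptions made for illustration; the thread does not show how RawData is actually produced.

```pig
-- Assumption: a script-wide default reducer count for reduce-side operators.
SET default_parallel 20;

-- Hypothetical input; the real schema and path are not shown in the thread.
Events = LOAD 'events' AS (type:int, uid:chararray, payload:chararray);

-- Hypothetical upstream reduce-side operator. PARALLEL here overrides the
-- default for this one operator and decides how many reducers feed the split.
RawData = DISTINCT Events PARALLEL 20;

-- The split itself carries no reducer setting; it runs wherever the planner
-- places it (map side, or in the reducers of the operator above).
SPLIT RawData INTO AURawData IF type == 2,
                   NURawData IF type == 1,
                   TutorialRawData IF type == 3 OR type == 15;

STORE AURawData INTO 'out/au';
STORE NURawData INTO 'out/nu';
STORE TutorialRawData INTO 'out/tutorial';
```

Whether this helps depends on which operator actually precedes the split in the real plan, which is why the explain output matters here.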

Haitao Yao 2012-09-16, 09:18
Dmitriy Ryaboy 2012-09-17, 05:01
Haitao Yao 2012-09-17, 08:53
Dmitriy Ryaboy 2012-09-17, 09:07
Haitao Yao 2012-09-17, 09:26
Dmitriy Ryaboy 2012-09-16, 02:39