How can I split the data with more reducers?
Haitao Yao 2012-09-16, 02:08
I've encountered a problem: the job failed because POSplit retained too much memory in the reducer. How can I specify more reducers for the split?

Here's a screenshot of the heap dump.

And here's the snippet of my split script:

split RawData into AURawData if type == 2, NURawData if type == 1, InRawData if type == 9, GCData if type == 61, HCData if type == 71, TutorialRawData if type == 3 or type == 15;

There are 3 similar split clauses in my script. The reducer count is always 1. How can I increase it?
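
For reference, Pig Latin has two documented ways to request more reducers: the script-wide `default_parallel` setting and the per-operator `PARALLEL` clause, which applies only to reduce-side operators such as GROUP, JOIN, ORDER and DISTINCT. Below is a minimal sketch, assuming a DISTINCT step runs ahead of the split. The snippet above does not show which operator actually precedes the split, so the input path, schema, output paths, the `Deduped` alias, and the value 20 are all hypothetical.

```pig
-- Assumption: 20 is an illustrative reducer count, not a tuned value.
-- default_parallel sets the reducer count for every reduce-side operator
-- in the script that has no PARALLEL clause of its own.
SET default_parallel 20;

-- Hypothetical input path and schema, for illustration only.
RawData = LOAD 'input/raw' AS (type:int, payload:chararray);

-- Hypothetical reduce-side step ahead of the split. PARALLEL overrides
-- default_parallel for this one operator.
Deduped = DISTINCT RawData PARALLEL 20;

-- SPLIT itself takes no PARALLEL clause; it runs in whichever
-- map or reduce phase the preceding operator produced.
SPLIT Deduped INTO AURawData IF type == 2, NURawData IF type == 1;

STORE AURawData INTO 'output/au';
STORE NURawData INTO 'output/nu';
```

Whether either setting helps here depends on which operator feeds the split. An operator that funnels everything to a single key (for example GROUP ... ALL) would still run on one reducer regardless of the requested parallelism.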


Haitao Yao
weibo: @haitao_yao
Skype:  haitao.yao.final