Haitao Yao 2013-04-03, 03:34
Prasanth J 2013-04-03, 05:19
Re: optimization for data cube
Haitao Yao 2013-04-03, 06:07
Thank you very much.
We're using Pig 0.9.2. I tried upgrading to 0.11, but it took an unacceptably long time to compile my large Pig script; with Pig 0.9.2 it's fine. I still haven't found the reason.
So I think I need to port the CUBE operation to 0.9.2 myself.
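For reference, here is a minimal sketch in Python (not Pig) of the expansion such a port would have to perform: each input tuple is turned into every combination in which a grouping field is either kept or replaced by an "all" marker, and the results are then grouped and aggregated. The function names, the use of None as the "all" marker, and summing as the aggregate are assumptions for illustration only.

```python
from collections import defaultdict
from itertools import product


def cube_expand(dims, value, all_marker=None):
    """Yield one (dims..., value) tuple per combination in which each
    dimension is either kept as-is or replaced by all_marker."""
    choices = [(d, all_marker) for d in dims]
    for combo in product(*choices):
        yield combo + (value,)


def cube_sum(rows):
    """Expand every row, then sum the values per expanded group key.
    Each row is (dim_1, ..., dim_n, value)."""
    totals = defaultdict(int)
    for *dims, value in rows:
        for expanded in cube_expand(tuple(dims), value):
            totals[expanded[:-1]] += value
    return dict(totals)


if __name__ == "__main__":
    rows = [("a", "b", "c", 1), ("a", "b", "d", 2)]
    for key, total in cube_sum(rows).items():
        print(key, total)
```

A row with three grouping fields expands into 2^3 = 8 tuples. In Pig itself, the usual way for a UDF to emit several tuples from one input is to return a bag and apply FLATTEN to it; whether that fits this script is untested here.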
On 2013-4-3, at 1:19 PM, Prasanth J <[EMAIL PROTECTED]> wrote:
> From the 0.11 release onwards, Pig natively supports the CUBE operator.
> Here is the documentation for the CUBE operator: http://pig.apache.org/docs/r0.11.1/basic.html#cube
> For your case the query can be represented as
> -- 'input' and 'output' are reserved words in Pig Latin, so other aliases are used here
> cubed = CUBE inp BY CUBE(group_a, group_b, group_c);
> result = FOREACH cubed GENERATE FLATTEN(group) AS (group_a, group_b, group_c), FLATTEN(cube.value) AS value;
> More examples can be found in the documentation.
> -- Prasanth
> On Apr 2, 2013, at 11:34 PM, Haitao Yao <[EMAIL PROTECTED]> wrote:
>> Hi, all
>> I have a tuple like this:
>> and I want to calculate the values the way a data cube does, which means I want to generate new tuples from the original one:
>> and then group by ($0, $1, $2).
>> How can I do this? I've written an Eval function, but it cannot generate more than one tuple from a single input tuple.
>> Haitao Yao
>> [EMAIL PROTECTED]
>> weibo: @haitao_yao
>> Skype: haitao.yao.final