Pig >> mail # user >> optimization for data cube


Re: optimization for data cube
Thank you very much.
We're using Pig-0.9.2. I upgraded to 0.11, but it took an unacceptably long time to compile my big Pig script. With Pig-0.9.2 it's OK. I still haven't found the reason.

So I think I need to port the CUBE operation to 0.9.2 myself.
Haitao Yao
[EMAIL PROTECTED]
weibo: @haitao_yao
Skype:  haitao.yao.final

On 2013-4-3, at 1:19 PM, Prasanth J <[EMAIL PROTECTED]> wrote:

> From the 0.11 release onwards, Pig natively supports the CUBE operator.
>
> Here is the documentation for the CUBE operator: http://pig.apache.org/docs/r0.11.1/basic.html#cube
>
> For your case the query can be represented as
>
> cubed = CUBE data BY CUBE(group_a,group_b,group_c);
> result = FOREACH cubed GENERATE FLATTEN(group) as (group_a,group_b,group_c), FLATTEN(cube.value) as value;
>
> More examples can be found in the documentation.
>
> Thanks
> -- Prasanth
>
> On Apr 2, 2013, at 11:34 PM, Haitao Yao <[EMAIL PROTECTED]> wrote:
>
>> Hi, all
>> I have a tuple like this:
>> (group_a,group_b,group_c,value)
>>
>> and I want to compute the values as a data cube, which means I want to generate these new tuples from each original one:
>>
>> (all,all,all,value)
>> (group_a,all,all,value)
>> (all,group_b,all,value)
>> (group_a,group_b,all,value)
>> (all,all,group_c,value)
>> (group_a,all,group_c,value)
>> (all,group_b,group_c,value)
>>
>> and then group by ($0, $1, $2).
>> How can I do this? I've written an Eval function, but it cannot generate more than one tuple from a single input tuple.
>>
>>
>> thanks.
>>
>>
>> Haitao Yao
>> [EMAIL PROTECTED]
>> weibo: @haitao_yao
>> Skype:  haitao.yao.final
>>
>
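
For readers following the thread, the expansion described in the original question can be sketched outside Pig. The snippet below is only an illustration in plain Python, not Pig code and not the CUBE operator's implementation. The names `cube_expand` and `cube_sum` are hypothetical, the "all" placeholder is taken from the question, and summing the values is an assumption, since the thread does not say which aggregate is wanted.

```python
from collections import defaultdict
from itertools import product

ALL = "all"  # placeholder label used in the question


def cube_expand(row):
    """Yield every cube combination of one (group_a, group_b, group_c, value) row.

    Hypothetical helper: each dimension is either kept or replaced by ALL,
    so a row with 3 dimensions yields 2**3 = 8 tuples (the original included).
    """
    *dims, value = row
    for combo in product(*[(d, ALL) for d in dims]):
        yield (*combo, value)


def cube_sum(rows):
    """Expand every row, then group by the dimension columns and sum the values.

    Summation is an assumed aggregate; swap in whatever the job needs.
    """
    totals = defaultdict(int)
    for row in rows:
        for *key, value in cube_expand(row):
            totals[tuple(key)] += value
    return dict(totals)


if __name__ == "__main__":
    sample = [("x", "y", "z", 1), ("x", "y", "w", 2)]
    for key, total in sorted(cube_sum(sample).items()):
        print(key, total)
```

The same shape carries over to a Pig UDF in principle: a function that returns a bag of tuples rather than a single tuple can be flattened into multiple rows before the GROUP. Note that a real dimension value equal to "all" would collide with the placeholder in this sketch.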
